UTF-8 characters in 'filename' for 'Content-Disposition' yield "IllegalArgumentException: Unexpected char"
The default character set for HTTP header values is ISO-8859-1. RFC 6266, however, describes how to encode the file name in a Content-Disposition
header, using the extended parameter syntax from RFC 5987: you name the character set and then percent-encode the file name's bytes in that character set.
Instead of filename="my-simple-filename" you use a parameter starting with filename*=utf-8'', like this:
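import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
// ...
String fileName = "3$ Mù F'RANçé_33902_Country_5_202105";
String contentDisposition = "attachment;filename*=utf-8''" + encodeFileName(fileName);
// ...
private static String encodeFileName(String fileName) throws UnsupportedEncodingException {
    // URLEncoder produces form encoding, where a space becomes "+"; the header needs "%20".
    // A literal "+" in the input has already become "%2B", so the replacement is safe.
    return URLEncoder.encode(fileName, "UTF-8").replace("+", "%20");
}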
Using the URL encoder and then replacing "+" with "%20" in the result is a cheap trick I found. It is handy if you want to avoid Guava, Spring's
ContentDisposition class or any other library and work with JRE classes only.
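One caveat with the URLEncoder trick: it leaves "*" unencoded, although "*" is not among the characters RFC 5987 allows unescaped in an extended value. If that matters, a stricter encoder is easy to write by hand. The following is only a minimal sketch, not a vetted implementation; the class name Rfc5987Encoder and its method names are made up for illustration. It also builds a plain ASCII filename fallback, which RFC 6266 recommends sending before filename* for clients that do not understand the extended syntax.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, JRE classes only
final class Rfc5987Encoder {
    // Punctuation that RFC 5987 "attr-char" allows unescaped, besides letters and digits
    private static final String ATTR_CHAR_PUNCTUATION = "!#$&+-.^_`|~";
    private static final char[] HEX = "0123456789ABCDEF".toCharArray();

    private static boolean isAttrChar(int c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || (c >= '0' && c <= '9')
            || ATTR_CHAR_PUNCTUATION.indexOf(c) >= 0;
    }

    // Percent-encodes every UTF-8 byte that is not an attr-char
    static String encode(String value) {
        StringBuilder sb = new StringBuilder();
        for (byte b : value.getBytes(StandardCharsets.UTF_8)) {
            int c = b & 0xFF;
            if (isAttrChar(c)) {
                sb.append((char) c);
            } else {
                sb.append('%').append(HEX[c >> 4]).append(HEX[c & 0x0F]);
            }
        }
        return sb.toString();
    }

    // Crude fallback: anything outside printable ASCII, plus '"' and '\', becomes '_'
    static String asciiFallback(String value) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            boolean safe = c >= 0x20 && c < 0x7F && c != '"' && c != '\\';
            sb.append(safe ? c : '_');
        }
        return sb.toString();
    }

    // Plain filename first, extended filename* second
    static String headerValue(String fileName) {
        return "attachment; filename=\"" + asciiFallback(fileName) + "\"; filename*=UTF-8''" + encode(fileName);
    }

    public static void main(String[] args) {
        // The file name from the answer, with the accented letters as Unicode escapes
        String fileName = "3$ M\u00f9 F'RAN\u00e7\u00e9_33902_Country_5_202105";
        System.out.println(headerValue(fileName));
    }
}
```

The resulting header value is pure ASCII, so a client that rejects non-ASCII header characters should accept it. Unlike URLEncoder, this sketch keeps "$" and "+" as they are, because both are allowed unescaped.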
Update: Here is a full MCVE, showing how to send a UTF-8 string both as a POST body and as a Content-Disposition file name. The demo server shows how to decode that header manually; a full-featured HTTP server or framework will usually do that for you.
Maven POM showing used dependencies:
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <groupId>org.example</groupId>
  <artifactId>SO_Java_OkHttp3SendUtf8_70804280</artifactId>
  <version>1.0-SNAPSHOT</version>
  <properties>
    <maven.compiler.source>11</maven.compiler.source>
    <maven.compiler.target>11</maven.compiler.target>
  </properties>
  <dependencies>
    <dependency>
      <groupId>com.squareup.okhttp3</groupId>
      <artifactId>okhttp</artifactId>
      <version>4.9.3</version>
    </dependency>
    <dependency>
      <groupId>org.nanohttpd</groupId>
      <artifactId>nanohttpd</artifactId>
      <version>2.3.1</version>
    </dependency>
  </dependencies>
</project>
OkHttp demo client:
import okhttp3.Headers;
import okhttp3.OkHttpClient;
import okhttp3.Request;
import okhttp3.RequestBody;
import okhttp3.Response;

import java.io.IOException;
import java.net.URL;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.Objects;

public class Client {
    public static void main(String[] args) throws IOException {
        String fileName = "3$ Mù F'RANçé_33902_Country_5_202105";
        // The header value is pure ASCII after percent-encoding, so OkHttp accepts it
        String contentDisposition = "attachment;filename*=utf-8''" + encodeFileName(fileName);
        RequestBody requestBody = RequestBody.create(fileName.getBytes(StandardCharsets.UTF_8));
        Headers headers = new Headers.Builder()
            .add("Content-Disposition", contentDisposition)
            .add("Content-Type", "application/octet-stream; charset=utf-8")
            .build();
        Request request = new Request.Builder()
            .headers(headers)
            .post(requestBody)
            .url(new URL("http://localhost:8080/"))
            .build();
        OkHttpClient client = new OkHttpClient();
        // Close the response when done, so that the connection is released
        try (Response response = client.newCall(request).execute()) {
            System.out.println(Objects.requireNonNull(response.body()).string());
        }
    }

    private static String encodeFileName(String fileName) {
        // Form encoding turns a space into "+", but the header needs "%20"
        return URLEncoder.encode(fileName, StandardCharsets.UTF_8).replace("+", "%20");
    }
}
NanoHTTPD demo server:
import fi.iki.elonen.NanoHTTPD;

import java.io.IOException;
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

public class Server extends NanoHTTPD {
    public Server() throws IOException {
        super(8080);
        start(NanoHTTPD.SOCKET_READ_TIMEOUT, false);
        System.out.println("\nRunning! Point your browsers to http://localhost:8080/ \n");
    }

    public static void main(String[] args) throws IOException {
        new Server();
    }

    private static final String UTF_8_FILE_NAME_PREFIX = ";filename*=utf-8''";
    private static final int UTF_8_FILE_NAME_PREFIX_LENGTH = UTF_8_FILE_NAME_PREFIX.length();

    @Override
    public Response serve(IHTTPSession session) {
        try {
            // For a raw POST body, NanoHTTPD stores the content under the key "postData"
            Map<String, String> files = new HashMap<>();
            session.parseBody(files);
            String postBody = files.get("postData");
            // Demo only: assumes the header is present and contains exactly this prefix
            String contentDisposition = session.getHeaders().get("content-disposition");
            String fileName = decodeFileName(
                contentDisposition.substring(
                    contentDisposition.indexOf(UTF_8_FILE_NAME_PREFIX) + UTF_8_FILE_NAME_PREFIX_LENGTH
                )
            );
            System.out.println("POST body: " + postBody);
            System.out.println("Content disposition: " + contentDisposition);
            System.out.println("UTF-8 file name: " + fileName);
            return newFixedLengthResponse(postBody + "\n" + fileName);
        }
        catch (IOException | ResponseException e) {
            e.printStackTrace();
            return newFixedLengthResponse(e.toString());
        }
    }

    private static String decodeFileName(String fileName) {
        // URLDecoder treats "+" as a space, but in this header a raw "+" is a literal plus sign,
        // so protect it before decoding; "%20" decodes to a space without any help
        return URLDecoder.decode(fileName.replace("+", "%2B"), StandardCharsets.UTF_8);
    }
}
If you run the server first and then the client, you will see this on the server console:
Running! Point your browsers to http://localhost:8080/
POST body: 3$ Mù F'RANçé_33902_Country_5_202105
Content disposition: attachment;filename*=utf-8''3%24%20M%C3%B9%20F%27RAN%C3%A7%C3%A9_33902_Country_5_202105
UTF-8 file name: 3$ Mù F'RANçé_33902_Country_5_202105
On the client console, you see:
3$ Mù F'RANçé_33902_Country_5_202105
3$ Mù F'RANçé_33902_Country_5_202105
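The demo server only recognizes the exact prefix ;filename*=utf-8''. A somewhat more tolerant decoder could read the charset from the value itself, match the parameter name case-insensitively, and leave a literal "+" alone. Here is a minimal sketch, assuming the header is simple enough to split on ";" (a quoted filename containing ";" would confuse it). ExtValueDecoder and its method names are hypothetical, not part of NanoHTTPD or OkHttp.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.Charset;

// Hypothetical helper for decoding filename*=charset'language'percent-encoded-value
final class ExtValueDecoder {
    // Returns the decoded filename* value, or null if the header has no such parameter
    static String extractFileName(String contentDisposition) {
        if (contentDisposition == null) {
            return null;
        }
        for (String part : contentDisposition.split(";")) {
            String parameter = part.trim();
            int equalsIndex = parameter.indexOf('=');
            if (equalsIndex < 0) {
                continue;
            }
            String name = parameter.substring(0, equalsIndex).trim();
            if (name.equalsIgnoreCase("filename*")) {
                return decodeExtValue(parameter.substring(equalsIndex + 1).trim());
            }
        }
        return null;
    }

    // Decodes charset'language'value; the language part may be empty
    static String decodeExtValue(String extValue) {
        int firstQuote = extValue.indexOf('\'');
        int secondQuote = extValue.indexOf('\'', firstQuote + 1);
        if (firstQuote < 0 || secondQuote < 0) {
            throw new IllegalArgumentException("Not an extended value: " + extValue);
        }
        Charset charset = Charset.forName(extValue.substring(0, firstQuote));
        String encoded = extValue.substring(secondQuote + 1);
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        for (int i = 0; i < encoded.length(); i++) {
            char c = encoded.charAt(i);
            if (c != '%') {
                bytes.write(c); // unescaped ASCII character, including a literal '+'
                continue;
            }
            if (i + 3 > encoded.length()) {
                throw new IllegalArgumentException("Truncated escape in: " + extValue);
            }
            int high = Character.digit(encoded.charAt(i + 1), 16);
            int low = Character.digit(encoded.charAt(i + 2), 16);
            if (high < 0 || low < 0) {
                throw new IllegalArgumentException("Bad escape in: " + extValue);
            }
            bytes.write((high << 4) | low);
            i += 2; // skip the two hex digits just consumed
        }
        return new String(bytes.toByteArray(), charset);
    }

    public static void main(String[] args) {
        String header = "attachment;filename*=utf-8''3%24%20M%C3%B9%20F%27RAN%C3%A7%C3%A9_33902_Country_5_202105";
        System.out.println(extractFileName(header));
    }
}
```

Decoding into a byte buffer first and converting to a string at the end keeps multi-byte UTF-8 sequences such as %C3%B9 intact.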