How to use UTF-8 in resource properties with ResourceBundle
I need to use UTF-8 in my resource properties using Java's ResourceBundle. When I enter the text directly into the properties file, it displays as mojibake.
My app runs on Google App Engine.
Can anyone give me an example? I can't get this to work.
Java 9 and newer
From Java 9 onwards, ResourceBundle reads .properties files as UTF-8 by default, so characters outside of ISO-8859-1 should work out of the box. Note that Properties#load(InputStream) itself still assumes ISO-8859-1.
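To make the Java 9+ behaviour concrete, here is a minimal sketch that feeds UTF-8 bytes straight into PropertyResourceBundle's InputStream constructor. The class name, the helper method and the sample key are made up for illustration, and the result assumes a Java 9 or newer runtime; on Java 8 the same call would decode the bytes as ISO-8859-1.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.PropertyResourceBundle;
import java.util.ResourceBundle;

public class Java9BundleDemo {

    // Hypothetical helper: loads an in-memory properties file through the
    // InputStream constructor, which decodes as UTF-8 by default on Java 9+.
    static String readValue(byte[] fileContent, String key) {
        try {
            ResourceBundle bundle =
                    new PropertyResourceBundle(new ByteArrayInputStream(fileContent));
            return bundle.getString(key);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // A Japanese greeting, written with escapes so this source file stays ASCII.
        String greeting = "\u3053\u3093\u306b\u3061\u306f";
        byte[] utf8File = ("greeting=" + greeting + "\n").getBytes(StandardCharsets.UTF_8);

        // Assumes Java 9+: the value comes back unchanged.
        System.out.println(greeting.equals(readValue(utf8File, "greeting")));
    }
}
```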
Java 8 and older
ResourceBundle#getBundle() uses PropertyResourceBundle under the covers when a .properties file is specified. This in turn uses Properties#load(InputStream) by default to load those properties files. As per the javadoc, they are read as ISO-8859-1 by default.
public void load(InputStream inStream) throws IOException
Reads a property list (key and element pairs) from the input byte stream. The input stream is in a simple line-oriented format as specified in load(Reader) and is assumed to use the ISO 8859-1 character encoding; that is each byte is one Latin1 character. Characters not in Latin1, and certain special characters, are represented in keys and elements using Unicode escapes as defined in section 3.3 of The Java™ Language Specification.
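The effect of that javadoc paragraph is easy to reproduce in isolation. The sketch below (class and method names are made up for illustration) loads the same UTF-8 bytes once through load(InputStream) and once through load(Reader) with an explicit UTF-8 reader; only the second returns the original text.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

public class PropertiesEncodingDemo {

    // Properties#load(InputStream): every byte becomes one Latin-1 character.
    static String readAsLatin1(byte[] fileContent, String key) {
        try {
            Properties props = new Properties();
            props.load(new ByteArrayInputStream(fileContent));
            return props.getProperty(key);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Properties#load(Reader): the caller chooses the charset, here UTF-8.
    static String readAsUtf8(byte[] fileContent, String key) {
        try {
            Properties props = new Properties();
            props.load(new InputStreamReader(
                    new ByteArrayInputStream(fileContent), StandardCharsets.UTF_8));
            return props.getProperty(key);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // "cafe" with an acute accent on the e; the accent takes two bytes in UTF-8.
        byte[] utf8File = "name=caf\u00e9\n".getBytes(StandardCharsets.UTF_8);

        // Two bytes read as two Latin-1 characters: 5 characters, mojibake.
        System.out.println(readAsLatin1(utf8File, "name").length());
        // Decoded as UTF-8: the original 4 characters.
        System.out.println(readAsUtf8(utf8File, "name").length());
    }
}
```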
So, you'd need to save them as ISO-8859-1. If you have any characters beyond the ISO-8859-1 range, can't write the \uXXXX escapes off the top of your head, and are thus forced to save the file as UTF-8, then you'd need to use the native2ascii tool to convert a UTF-8 saved properties file to an ISO-8859-1 saved properties file in which all characters it can't represent directly are converted into \uXXXX format. The example below converts a UTF-8 encoded properties file text_utf8.properties to a valid ISO-8859-1 encoded properties file text.properties.
native2ascii -encoding UTF-8 text_utf8.properties text.properties
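If it helps to see what that conversion amounts to, here is a rough sketch of the escaping step in plain Java. It is not the tool's actual implementation: the class and method names are invented, and escaping everything above 0x7F with lowercase hex digits is an assumption on my part. It also reads the escaped line back with Properties#load(InputStream) to show the round trip.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

public class Native2AsciiSketch {

    // Replaces every char above 0x7F with a backslash-u escape of its UTF-16 code unit.
    // Characters outside the BMP end up as two escapes (one per surrogate).
    static String escapeNonAscii(String text) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c > 0x7F) {
                out.append(String.format("\\u%04x", (int) c));
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    // Reads one escaped "key=value" line the way Java 8 reads a properties file.
    static String loadValue(String escapedLine, String key) {
        try {
            Properties props = new Properties();
            props.load(new ByteArrayInputStream(
                    escapedLine.getBytes(StandardCharsets.ISO_8859_1)));
            return props.getProperty(key);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String greeting = "\u3053\u3093\u306b\u3061\u306f";
        String escapedLine = "greeting=" + escapeNonAscii(greeting);

        System.out.println(escapedLine);                                      // pure ASCII
        System.out.println(greeting.equals(loadValue(escapedLine, "greeting"))); // true
    }
}
```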
When using a sane IDE such as Eclipse, this is already done automatically when you create a .properties file in a Java based project and use Eclipse's own editor. Eclipse will transparently convert the characters beyond the ISO-8859-1 range to \uXXXX format. See also the screenshots below (note the "Properties" and "Source" tabs at the bottom):
Alternatively, you could also create a custom ResourceBundle.Control implementation wherein you explicitly read the properties files as UTF-8 using InputStreamReader, so that you can just save them as UTF-8 without the hassle of native2ascii. Here's a kickoff example:
    import java.io.IOException;
    import java.io.InputStream;
    import java.io.InputStreamReader;
    import java.net.URL;
    import java.net.URLConnection;
    import java.util.Locale;
    import java.util.PropertyResourceBundle;
    import java.util.ResourceBundle;
    import java.util.ResourceBundle.Control;

    public class UTF8Control extends Control {
        @Override
        public ResourceBundle newBundle
            (String baseName, Locale locale, String format, ClassLoader loader, boolean reload)
                throws IllegalAccessException, InstantiationException, IOException
        {
            // The below is a copy of the default implementation.
            String bundleName = toBundleName(baseName, locale);
            String resourceName = toResourceName(bundleName, "properties");
            ResourceBundle bundle = null;
            InputStream stream = null;
            if (reload) {
                // Bypass URL caches so that a changed file is picked up on reload.
                URL url = loader.getResource(resourceName);
                if (url != null) {
                    URLConnection connection = url.openConnection();
                    if (connection != null) {
                        connection.setUseCaches(false);
                        stream = connection.getInputStream();
                    }
                }
            } else {
                stream = loader.getResourceAsStream(resourceName);
            }
            if (stream != null) {
                try {
                    // Only this line is changed to make it read properties files as UTF-8.
                    bundle = new PropertyResourceBundle(new InputStreamReader(stream, "UTF-8"));
                } finally {
                    stream.close();
                }
            }
            return bundle;
        }
    }
This can be used as follows:
ResourceBundle bundle = ResourceBundle.getBundle("com.example.i18n.text", new UTF8Control());
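For an end-to-end check without touching your real classpath, something like the following may be useful. It is only a sketch: it writes a throwaway text.properties as UTF-8 into a temp directory, exposes that directory through a URLClassLoader, and loads it with a trimmed-down control (no reload handling) that applies the same UTF-8 idea. All names in it (Utf8BundleDemo, Utf8OnlyControl, lookup, the "text" bundle) are made up for illustration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Locale;
import java.util.PropertyResourceBundle;
import java.util.ResourceBundle;

public class Utf8BundleDemo {

    // Compact variant for this demo only: always reads .properties files as UTF-8.
    static class Utf8OnlyControl extends ResourceBundle.Control {
        @Override
        public ResourceBundle newBundle(String baseName, Locale locale, String format,
                                        ClassLoader loader, boolean reload) throws IOException {
            String resourceName = toResourceName(toBundleName(baseName, locale), "properties");
            try (InputStream stream = loader.getResourceAsStream(resourceName)) {
                if (stream == null) {
                    return null; // no such file for this candidate locale
                }
                return new PropertyResourceBundle(
                        new InputStreamReader(stream, StandardCharsets.UTF_8));
            }
        }
    }

    // Writes a UTF-8 properties file to a temp directory and looks up one key from it.
    static String lookup(String key) {
        try {
            Path dir = Files.createTempDirectory("bundle-demo");
            String content = "greeting=\u3053\u3093\u306b\u3061\u306f\n";
            Files.write(dir.resolve("text.properties"), content.getBytes(StandardCharsets.UTF_8));

            try (URLClassLoader loader = new URLClassLoader(new URL[] { dir.toUri().toURL() })) {
                ResourceBundle bundle =
                        ResourceBundle.getBundle("text", Locale.ROOT, loader, new Utf8OnlyControl());
                return bundle.getString(key);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Prints the greeting as stored in the file, provided the console can render it.
        System.out.println(lookup("greeting"));
    }
}
```

The temp directory is left behind for simplicity; a real test would clean it up.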
See also:
- Unicode - How to get the characters right?
Given that you have an instance of ResourceBundle, you can get a String with:
    String val = bundle.getString(key);
I solved my Japanese display problem by re-decoding that value, which works when the file was saved as UTF-8 but read as ISO-8859-1:
    return new String(val.getBytes("ISO-8859-1"), "UTF-8");
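A small sketch of why this trick works, and where it breaks. ISO-8859-1 maps every byte to exactly one character, so turning the garbled string back into ISO-8859-1 bytes recovers the original UTF-8 bytes. The class and method names below are invented for the demo. Note the caveat in the last call: applying the repair to a string that was already decoded correctly destroys it, so this only suits setups where the file is known to be read as ISO-8859-1.

```java
import java.nio.charset.StandardCharsets;

public class MojibakeRepairDemo {

    // Simulates the bug: UTF-8 bytes decoded as if they were ISO-8859-1.
    static String misdecode(String original) {
        return new String(original.getBytes(StandardCharsets.UTF_8),
                StandardCharsets.ISO_8859_1);
    }

    // The workaround from the answer: recover the raw bytes, then decode them as UTF-8.
    static String repair(String misdecoded) {
        return new String(misdecoded.getBytes(StandardCharsets.ISO_8859_1),
                StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String greeting = "\u3053\u3093\u306b\u3061\u306f"; // 5 Japanese characters

        String garbled = misdecode(greeting);
        System.out.println(garbled.length());                 // 15: one char per UTF-8 byte
        System.out.println(greeting.equals(repair(garbled))); // true

        // Caveat: repairing an already correct string replaces each character with '?'.
        System.out.println(repair(greeting));                 // ?????
    }
}
```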