Why are charset names not constants?

The simple answer to the question asked is that the available charset strings vary from platform to platform.

However, there are six that are required to be present, so constants could have been made for those long ago. I don't know why they weren't.
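To make the "six that are required" point concrete, here is a minimal sketch that checks them at runtime. The class name RequiredCharsets and its helper method are made up for illustration; Charset.isSupported and Charset.availableCharsets are the standard java.nio.charset calls.

```java
import java.nio.charset.Charset;

// Hypothetical demo class, not part of any library.
class RequiredCharsets {
    // The six charset names the Charset documentation requires on every Java platform.
    static final String[] REQUIRED = {
        "US-ASCII", "ISO-8859-1", "UTF-8", "UTF-16BE", "UTF-16LE", "UTF-16"
    };

    // Returns true if every required charset is reported as supported by this JVM.
    static boolean allRequiredSupported() {
        for (String name : REQUIRED) {
            if (!Charset.isSupported(name)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // The total varies from platform to platform; only the six above are guaranteed.
        System.out.println("Charsets installed here: " + Charset.availableCharsets().size());
        System.out.println("All six required charsets present: " + allRequiredSupported());
    }
}
```

The count printed on the first line is the part that differs between machines, which is why the full set of names could never be a fixed list of constants.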

JDK 1.4 did a great thing by introducing the Charset type. At this point, they wouldn't have wanted to provide String constants anymore, since the goal is to get everyone using Charset instances. So why not provide the six standard Charset constants, then? I asked Martin Buchholz since he happens to be sitting right next to me, and he said there wasn't any particularly great reason, except that at the time, things were still half-baked -- too few JDK APIs had been retrofitted to accept Charset, and of the ones that had been, the Charset overloads usually performed slightly worse.

It's sad that it's only in JDK 1.6 that they finally finished outfitting everything with Charset overloads. And that this backwards performance situation still exists (the reason why is incredibly weird and I can't explain it, but is related to security!).

Long story short -- just define your own constants, or use Guava's Charsets class which Tony the Pony linked to (though that library has not really been released yet).
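A rough sketch of what "define your own constants" could look like. The class name MyCharsets is invented for this example; the field names simply mirror the charset names.

```java
import java.nio.charset.Charset;

// Hypothetical in-house constants holder; the name MyCharsets is an assumption.
final class MyCharsets {
    private MyCharsets() {} // no instances, constants only

    // Charset.forName should never throw for these names, because every
    // Java platform is required to support them.
    static final Charset US_ASCII   = Charset.forName("US-ASCII");
    static final Charset ISO_8859_1 = Charset.forName("ISO-8859-1");
    static final Charset UTF_8      = Charset.forName("UTF-8");
    static final Charset UTF_16BE   = Charset.forName("UTF-16BE");
    static final Charset UTF_16LE   = Charset.forName("UTF-16LE");
    static final Charset UTF_16     = Charset.forName("UTF-16");
}
```

Callers on Java 6 or later can then write something like new String(bytes, MyCharsets.UTF_8) without repeating the name string everywhere.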

Update: a StandardCharsets class is in JDK 7.

Two years later, and Java 7's StandardCharsets now defines constants for the 6 standard charsets.
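For anyone on Java 7 or later, usage might look like the sketch below. The class StandardCharsetsDemo and its two helpers are invented for illustration; StandardCharsets.UTF_8 is the real JDK constant.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class showing the JDK 7 constants in use.
class StandardCharsetsDemo {
    // Encodes text as UTF-8 without any charset-name string or checked exception.
    static byte[] encode(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    // Decodes UTF-8 bytes back into a String.
    static String decode(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] bytes = encode("caf\u00e9"); // "café", written with an escape to stay ASCII-only
        System.out.println(bytes.length + " bytes, decoded: " + decode(bytes));
    }
}
```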

If you are stuck on Java 5/6, you can use Guava's Charsets constants, as suggested by Kevin Bourrillion and Jon Skeet.

I'd argue that we can do much better than that... why aren't the guaranteed-to-be-available charsets accessible directly? Charset.UTF8 should be a reference to the Charset, not the name as a string. That way we wouldn't have to handle UnsupportedEncodingException all over the place.
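To illustrate the complaint, here is a small sketch contrasting the two styles. The class and method names are made up; the String.getBytes(Charset) overload it relies on exists from Java 6 onward.

```java
import java.io.UnsupportedEncodingException;
import java.nio.charset.Charset;

// Hypothetical demo class contrasting name-based and Charset-based encoding.
class EncodingContrast {
    // Name-based API: forces a try/catch for an exception that cannot
    // actually occur, since UTF-8 is required on every Java platform.
    static byte[] byName(String text) {
        try {
            return text.getBytes("UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new AssertionError(e);
        }
    }

    // Charset-based API: no checked exception to handle at the call site.
    static byte[] byCharset(String text) {
        return text.getBytes(Charset.forName("UTF-8"));
    }
}
```

Both methods produce the same bytes; the only difference is the boilerplate the name-based overload imposes on every caller.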

Mind you, I also think that .NET chose a better strategy by defaulting to UTF-8 everywhere. It then screwed up by naming the "operating system default" encoding property simply Encoding.Default - which isn't the default within .NET itself :(

Back to ranting about Java's charset support - why isn't there a constructor for FileWriter/FileReader which takes a Charset? Basically those are almost useless classes due to that restriction - you almost always need an InputStreamReader around a FileInputStream or the equivalent for output :(
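The workaround described above, sketched for JDKs where FileReader/FileWriter have no Charset constructor. The class CharsetFileIo and its method names are invented for this example, and it uses try-with-resources, so it assumes Java 7 or later.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.Writer;
import java.nio.charset.Charset;

// Hypothetical helper class; names are assumptions, not a library API.
class CharsetFileIo {
    // Writes text to a file using an explicit charset instead of the platform default.
    static void write(File file, String text, Charset charset) throws IOException {
        try (Writer out = new OutputStreamWriter(new FileOutputStream(file), charset)) {
            out.write(text);
        }
    }

    // Reads a whole file into a String using an explicit charset.
    static String read(File file, Charset charset) throws IOException {
        try (Reader in = new InputStreamReader(new FileInputStream(file), charset)) {
            StringBuilder sb = new StringBuilder();
            char[] buffer = new char[4096];
            int n;
            while ((n = in.read(buffer)) != -1) {
                sb.append(buffer, 0, n);
            }
            return sb.toString();
        }
    }
}
```

Wrapping the raw streams this way is what makes the charset explicit; a bare FileReader would silently use the platform default encoding.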

Nurse, nurse - where's my medicine?

EDIT: It occurs to me that this hasn't really answered the question. The real answer is presumably either "nobody involved thought of it" or "somebody involved thought it was a bad idea." I would strongly suggest that in-house utility classes providing the names or charsets avoid duplication around the codebase... Or you could just use the one that we used at Google when this answer was first written. (Note that as of Java 7, you'd just use StandardCharsets instead.)
