Java PreparedStatement UTF-8 character problem

I have a prepared statement:

PreparedStatement st;

and at my code i try to use st.setString method.

st.setString(1, userName);

Value of userName is şakça. setString methods changes 'şakça' to '?akça'. It doesnt recognize UTF-8 characters. How can i solve this problem?

Thanks.


Solution 1:

The number of ways this can get screwed up is actually quite impressive. If you're using MySQL, try adding a characterEncoding=UTF-8 parameter to the end of your JDBC connection URL:

jdbc:mysql://server/database?characterEncoding=UTF-8

You should also check that the table / column character set is UTF-8.

Solution 2:

Whenever a database changes a character to ?, then it simply means that the codepoint of the character in question is completely out of the range for the character encoding as the table is configured to use.

As to the cause of the problem: the ç lies within ISO-8859-1 range and has exactly the same codepoint as in UTF-8 (U+00E7). However, the UTF-8 codepoint of ş lies completely outside the range of ISO-8859-1 (U+015F while ISO-8859-1 only goes up to U+00FF). The DB won't persist the character and replace it by ?.

So, I suspect that your DB table is still configured to use ISO-8859-1 (or in one of other compatible ISO-8859 encodings where ç has the same codepoint as in UTF-8).

The Java/JDBC API is doing its job perfectly fine with regard to character encoding (Java uses Unicode all the way) and the JDBC DB connection encoding is also configured correctly. If Java/JDBC would have incorrectly used ISO-8859-1, then the persisted result would have been Åakça (the ş exist of bytes 0xC5 and 0x9F which represents Å and a in ISO-8859-1 and the ç exist of bytes 0xC3 and 0xA7 which represents à and § in ISO-8859-1).