"Incorrect string value" when trying to insert UTF-8 into MySQL via JDBC?
MySQL's utf8
permits only the Unicode characters that can be represented with 3 bytes in UTF-8. Here you have a character that needs 4 bytes: \xF0\x90\x8D\x83 (U+10343 GOTHIC LETTER SAUIL).
If you have MySQL 5.5 or later you can change the column encoding from utf8
to utf8mb4
. This encoding allows storage of characters that occupy 4 bytes in UTF-8.
You may also have to set the server property character_set_server
to utf8mb4
in the MySQL configuration file. It seems that Connector/J defaults to 3-byte Unicode otherwise:
For example, to use 4-byte UTF-8 character sets with Connector/J, configure the MySQL server with
character_set_server=utf8mb4
, and leavecharacterEncoding
out of the Connector/J connection string. Connector/J will then autodetect the UTF-8 setting.
The strings that contain \xF0
are simply characters encoded as multiple bytes using UTF-8.
Although your collation is set to utf8_general_ci, I suspect that the character encoding of the database, table or even column may be different. They are independent settings. Try:
ALTER TABLE database.table MODIFY COLUMN col VARCHAR(255)
CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci NOT NULL;
Substitute whatever your actual data type is for VARCHAR(255)
Got the same problem, to save the data with utf8mb4
needs to make sure:
character_set_client, character_set_connection, character_set_results
areutf8mb4
:character_set_client
andcharacter_set_connection
indicate the character set in which statements are sent by the client,character_set_results
indicates the character set in which the server returns query results to the client.
See charset-connection.the table and column encoding is
utf8mb4
For JDBC, there are two solutions:
Solution 1 (need to restart MySQL):
-
modify
my.cnf
like the following and restart MySQL:[mysql] default-character-set=utf8mb4 [mysqld] character-set-server=utf8mb4 collation-server=utf8mb4_unicode_ci
this can make sure the database and character_set_client, character_set_connection, character_set_results
are utf8mb4
by default.
restart MySQL
change the table and column encoding to
utf8mb4
STOP specifying
characterEncoding=UTF-8
andcharacterSetResults=UTF-8
in the jdbc connector,cause this will overridecharacter_set_client
,character_set_connection
,character_set_results
toutf8
Solution two (don't need to restart MySQL):
change the table and column encoding to
utf8mb4
specifying
characterEncoding=UTF-8
in the jdbc connector,cause the jdbc connector doesn't suportutf8mb4
.-
write your sql statment like this (need to add
allowMultiQueries=true
to jdbc connector):'SET NAMES utf8mb4;INSERT INTO Mytable ...';
this will make sure each connection to the server, character_set_client,character_set_connection,character_set_results
are utf8mb4
.
Also see charset-connection.