MySQL "incorrect string value" error when save unicode string in Django

None of these answers solved the problem for me. The root cause being:

You cannot store 4-byte characters in MySQL with the utf-8 character set.

MySQL has a 3 byte limit on utf-8 characters (yes, it's wack, nicely summed up by a Django developer here)

To solve this you need to:

  1. Change your MySQL database, table and columns to use the utf8mb4 character set (only available from MySQL 5.5 onwards)
  2. Specify the charset in your Django settings file as below:

settings.py

DATABASES = {
    'default': {
        'ENGINE':'django.db.backends.mysql',
        ...
        'OPTIONS': {'charset': 'utf8mb4'},
    }
}

Note: When recreating your database you may run into the 'Specified key was too long' issue.

The most likely cause is a CharField which has a max_length of 255 and some kind of index on it (e.g. unique). Because utf8mb4 uses 33% more space than utf-8 you'll need to make these fields 33% smaller.

In this case, change the max_length from 255 to 191.

Alternatively you can edit your MySQL configuration to remove this restriction but not without some django hackery

UPDATE: I just ran into this issue again and ended up switching to PostgreSQL because I was unable to reduce my VARCHAR to 191 characters.


I had the same problem and resolved it by changing the character set of the column. Even though your database has a default character set of utf-8 I think it's possible for database columns to have a different character set in MySQL. Here's the SQL QUERY I used:

    ALTER TABLE database.table MODIFY COLUMN col VARCHAR(255)
    CHARACTER SET utf8 COLLATE utf8_general_ci NOT NULL;

If you have this problem here's a python script to change all the columns of your mysql database automatically.

#! /usr/bin/env python
import MySQLdb

host = "localhost"
passwd = "passwd"
user = "youruser"
dbname = "yourdbname"

db = MySQLdb.connect(host=host, user=user, passwd=passwd, db=dbname)
cursor = db.cursor()

cursor.execute("ALTER DATABASE `%s` CHARACTER SET 'utf8' COLLATE 'utf8_unicode_ci'" % dbname)

sql = "SELECT DISTINCT(table_name) FROM information_schema.columns WHERE table_schema = '%s'" % dbname
cursor.execute(sql)

results = cursor.fetchall()
for row in results:
  sql = "ALTER TABLE `%s` convert to character set DEFAULT COLLATE DEFAULT" % (row[0])
  cursor.execute(sql)
db.close()

If it's a new project, I'd just drop the database, and create a new one with a proper charset:

CREATE DATABASE <dbname> CHARACTER SET utf8;

I just figured out one method to avoid above errors.

Save to database

user.first_name = u'Rytis'.encode('unicode_escape')
user.last_name = u'Slatkevičius'.encode('unicode_escape')
user.save()
>>> SUCCEED

print user.last_name
>>> Slatkevi\u010dius
print user.last_name.decode('unicode_escape')
>>> Slatkevičius

Is this the only method to save strings like that into a MySQL table and decode it before rendering to templates for display?