How to properly backup mediawiki database (mysql) without messing up the data?

I want to backup a mediawiki database stored in a MySQL server 5.1.36 using mysqldump.

Most of the wiki articles are written in spanish and a don't want to mess up with it by creating the dump with the wrong character set.

mysql> status
--------------
...
Current database:       wikidb
Current user:           root@localhost
...
Server version:         5.1.36-community-log MySQL Community Server (GPL)
....
Server characterset:    latin1
Db     characterset:    utf8
Client characterset:    latin1
Conn.  characterset:    latin1
...

Using the following command:

mysql> show create table text;

I see that the table create statement set the charset to binary:

CREATE TABLE `text` (
  `old_id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `old_text` mediumblob NOT NULL,
  `old_flags` tinyblob NOT NULL,
  PRIMARY KEY (`old_id`)
) ENGINE=InnoDB AUTO_INCREMENT=317 DEFAULT CHARSET=binary MAX_ROWS=10000000 AVG_ROW_LENGTH=10240

How should I use mysqldump to properly generate a backup for that database?


mysqldump and show create table are usually pretty reliable workhorses and shouldn't change encodings on their own. If they use a "binary" character set, chances are that is actually the character set given to the table (not necessarily the columns, though). Can you re-check the current table character set in a database browser?

I would try a plain, simple mysqldump into a file, restoring the dump into a mirror database, and take a close look at the special characters. Different from backups taken using phpMyAdmin, those from the command line tend to work very well in my experience.


Backing up a database is a vital thing to do, so bravo for getting to it.

I recommend testing your backup strategy, not just because of the concern over language, but because one of the most repeated infamous worst-disaster-ever scenarios for a great many organizations is the backup that was never tested and was done wrong for years - and wasn't there, working, when it was eventually needed. The WHOLE backup process should be tested.

First, stop all server activity and take an operating system backup. This is never a bad idea except for the down-time of having a server down. Some database systems can let you do an OS backup while the database system is running and still create a perfectly valid backup! PostgreSQL is such a system - I highly recommend it.

As for character sets, Unicode (aka UTF-8) contains all of Spanish, but ISO-8859-1 is widely chosen.

See these references:

Unicode characters and The Unicode Consortium and I thought this one might be helpful too: MySQL Character Set for ISO-8859-1

I also recommend having a second installation and restoring your backup into it. This does three things:

  1. It helps validate that the backup process itself is working properly
  2. It provides a place for testing with real-world data, and;
  3. It provides a sort of additional "backup" that has less overhead to get to than doing a full recovery - this may prove useful at various times for various reasons.

If you don't want to stop your server for backup, I strongly advice that you have a look at Xtrabackup. This tool performs online backup of InnoDB (and XtraDB) tables. I use it on my production servers. It does not stress them too much and can perform incremental backups.