How to properly backup mediawiki database (mysql) without messing up the data?
I want to backup a mediawiki database stored in a MySQL server 5.1.36 using mysqldump.
Most of the wiki articles are written in spanish and a don't want to mess up with it by creating the dump with the wrong character set.
mysql> status
--------------
...
Current database: wikidb
Current user: root@localhost
...
Server version: 5.1.36-community-log MySQL Community Server (GPL)
....
Server characterset: latin1
Db characterset: utf8
Client characterset: latin1
Conn. characterset: latin1
...
Using the following command:
mysql> show create table text;
I see that the table create statement set the charset to binary:
CREATE TABLE `text` (
`old_id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`old_text` mediumblob NOT NULL,
`old_flags` tinyblob NOT NULL,
PRIMARY KEY (`old_id`)
) ENGINE=InnoDB AUTO_INCREMENT=317 DEFAULT CHARSET=binary MAX_ROWS=10000000 AVG_ROW_LENGTH=10240
How should I use mysqldump to properly generate a backup for that database?
mysqldump
and show create table
are usually pretty reliable workhorses and shouldn't change encodings on their own. If they use a "binary" character set, chances are that is actually the character set given to the table (not necessarily the columns, though). Can you re-check the current table character set in a database browser?
I would try a plain, simple mysqldump
into a file, restoring the dump into a mirror database, and take a close look at the special characters. Different from backups taken using phpMyAdmin, those from the command line tend to work very well in my experience.
Backing up a database is a vital thing to do, so bravo for getting to it.
I recommend testing your backup strategy, not just because of the concern over language, but because one of the most repeated infamous worst-disaster-ever scenarios for a great many organizations is the backup that was never tested and was done wrong for years - and wasn't there, working, when it was eventually needed. The WHOLE backup process should be tested.
First, stop all server activity and take an operating system backup. This is never a bad idea except for the down-time of having a server down. Some database systems can let you do an OS backup while the database system is running and still create a perfectly valid backup! PostgreSQL is such a system - I highly recommend it.
As for character sets, Unicode (aka UTF-8) contains all of Spanish, but ISO-8859-1 is widely chosen.
See these references:
Unicode characters and The Unicode Consortium and I thought this one might be helpful too: MySQL Character Set for ISO-8859-1
I also recommend having a second installation and restoring your backup into it. This does three things:
- It helps validate that the backup process itself is working properly
- It provides a place for testing with real-world data, and;
- It provides a sort of additional "backup" that has less overhead to get to than doing a full recovery - this may prove useful at various times for various reasons.
If you don't want to stop your server for backup, I strongly advice that you have a look at Xtrabackup. This tool performs online backup of InnoDB (and XtraDB) tables. I use it on my production servers. It does not stress them too much and can perform incremental backups.