How to back up a Django db
I have a Django application that uses a Postgres database. I need to be able to back up and restore the db, both to ensure no data is lost and to be able to copy data from the production server to the development server during testing.
There seem to be a few different ways to do this:
1. Just interact with the db directly. So, for Postgres I might write a script using pg_dumpall and psql.
2. Use the sqlclear/sqlall commands that come with Django.
3. Use the dumpdata/loaddata commands that come with Django. So create new fixtures from the db you want to back up and then load them into the db you want to restore.
4. Use a Django plugin like django-dbbackup.
I really don't understand the pros/cons of these different techniques.
Just off the top of my head: Option 1 is database-specific and option 3 seems more suited to setting up initial data. But I'm still not sure what advantages option 4 has over option 2.
The problem with options 1-3 is that media files (anything uploaded through a FileField) are not included in the backup. It is possible to back up the directory containing the media files separately. However, because Django doesn't remove files when they are no longer referenced by a FileField, you will inevitably end up with files in the backup that don't need to be there.
That's why I would go with option #4. In particular, I recommend django-archive*. Some of its features include:
- Dumps the contents of all important models (by default ContentType, Permission, and Session are excluded since they are populated by manage.py migrate) and lets you choose additional models to exclude.
- Includes media files referenced by FileField and ImageField fields. Note that only the files referenced by rows in the database are included; files left over by deleted rows are ignored.
- Produces a single archive containing both the database backup and media files.
- Provides options for customizing the location where archives should be stored, the filename format, and archive type (gz and bz2).
Installation is as simple as adding django_archive to INSTALLED_APPS and setting options in settings.py if needed. Once installed, you can immediately create an archive of your entire database (including media files) by running:
./manage.py archive
* Disclaimer: I am the author of the package
For regular backups I'd go for option 1, using PostgreSQL's own native tool, as it is probably the most efficient.
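A minimal sketch of what option 1 could look like, assuming the PostgreSQL client tools are on PATH and credentials come from the environment (for example PGPASSWORD or ~/.pgpass). Note that it swaps in pg_dump, which dumps a single database, for pg_dumpall, which dumps the whole cluster. The function names and the database names in the usage note are hypothetical.

```python
import subprocess


def dump_command(dbname, outfile):
    """Build a pg_dump call that writes a plain-SQL dump of one database."""
    return [
        "pg_dump",
        "--no-owner",  # skip ownership statements so the dump loads under another role
        f"--file={outfile}",
        f"--dbname={dbname}",
    ]


def restore_command(dbname, infile):
    """Build a psql call that replays the dump and stops at the first error."""
    return [
        "psql",
        "--set=ON_ERROR_STOP=1",
        f"--dbname={dbname}",
        f"--file={infile}",
    ]


def run(command):
    """Run the command; raises CalledProcessError on a non-zero exit status."""
    subprocess.run(command, check=True)
```

On the production server you would call something like run(dump_command("mydb", "backup.sql")), copy the file over, and then call run(restore_command("mydb_dev", "backup.sql")) against an empty development database.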
I would argue that option 2 is primarily concerned with creating the tables and loading initial data, so it is not suitable for backups.
Option 3 can be used for backups and would be particularly useful if you needed to migrate to a different database platform since the data is dumped in a non-SQL form, i.e. JSON understood by Django.
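As a rough sketch of option 3, the helpers below build the manage.py calls for dumping to a JSON fixture and loading it again. They assume they are run from the directory containing manage.py. Excluding contenttypes and auth.permission and using natural keys is a common convention rather than something required, and the helper names are hypothetical.

```python
import subprocess
import sys


def dumpdata_command(outfile, exclude=("contenttypes", "auth.permission")):
    """Build a `manage.py dumpdata` call that writes a JSON fixture to outfile."""
    command = [
        sys.executable, "manage.py", "dumpdata",
        "--natural-foreign",  # serialize foreign keys by natural key where defined
        "--natural-primary",  # omit primary keys for models that have natural keys
        "--indent=2",
        f"--output={outfile}",
    ]
    # Assumption: these tables are recreated by migrate, so they are left out.
    for label in exclude:
        command.append(f"--exclude={label}")
    return command


def loaddata_command(fixture):
    """Build a `manage.py loaddata` call that loads the given fixture file."""
    return [sys.executable, "manage.py", "loaddata", fixture]


def run(command):
    """Run the command; raises CalledProcessError on a non-zero exit status."""
    subprocess.run(command, check=True)
```

For example, run(dumpdata_command("backup.json")) on the source project, then run(loaddata_command("backup.json")) on a freshly migrated target database, which may use a different database backend.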
Option 4, the plugin, appears to use the database's own backup tools (as per option 1) but additionally helps push your backups to cloud storage such as Amazon S3 or Dropbox.
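For reference, a settings.py fragment for django-dbbackup might look roughly like the sketch below. The setting names are the ones I recall from the django-dbbackup documentation and may differ between versions, so check the docs for the release you install. The backup path is a hypothetical placeholder, and a local filesystem storage is used here instead of S3 or Dropbox.

```python
# settings.py fragment (sketch, not a complete settings module)
INSTALLED_APPS = [
    # ... your existing apps ...
    "dbbackup",  # django-dbbackup
]

# Store backups on the local filesystem; a cloud storage backend could be
# substituted here.
DBBACKUP_STORAGE = "django.core.files.storage.FileSystemStorage"
DBBACKUP_STORAGE_OPTIONS = {"location": "/var/backups/myproject"}  # hypothetical path
```

With that in place, ./manage.py dbbackup creates a backup and ./manage.py dbrestore restores the most recent one.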