What is the best way to backup a Linux webserver? [closed]

I am running a typical LAMP server. I need to back up the data on this machine over the network without service interruption. The backup system understands SSH, FTP, SMB, NFS & iSCSI. What would be the best approach to accomplish this?


You could use 'scp' (which runs over SSH) to back up the data, but the better option to look into is setting up 'rsync': Replicating Webservers

rsync is pretty fast because it only transfers the differences since the last run, as opposed to doing a full copy every time.


Rsync is the best thing ever. However, as a general-purpose backup tool it has a couple of failings (not failings in itself; it's a fine file-copy tool). You have two issues:

  1. backup sets are made over time, so the first file copied will be older than the last. Obvious, yes, but if two files need to be in sync with each other, your backup may not reflect a consistent server state. This might not matter most of the time, but when it does, you will have a problem when you restore.

  2. some data doesn't copy well, e.g. a MySQL database file. If the file is open and some of its contents are only in memory, the true state of the server will not be backed up.

The solution is reasonably simple: make sure every important running service flushes its data to disk immediately before the backup, and then take a snapshot of the disk state.

MySQL has a tool to dump the databases: mysqldump. I use it to create a backup SQL file that gets backed up, and I ignore the raw MySQL data files after that; when I come to restore, I know I can restore from the dumps. LVM has facilities to take disk snapshots: a snapshot uses some spare space in the volume group, and from the moment it is taken, writes to the original volume are handled copy-on-write, so the snapshot presents an unchanging view of the disk exactly as it was at that instant. After you have taken the backup, delete the snapshot and the live volume carries on as normal.
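A sketch of that flow might look like this. It requires root and an existing LVM setup; the volume group `vg0`, logical volume `lv_data`, and all paths and host names are placeholder assumptions, not details from the answer, and `--single-transaction` only gives a consistent dump for transactional (InnoDB) tables:

```shell
#!/bin/sh
# Sketch only: needs root, LVM, and free space in the volume group.
# vg0/lv_data and all paths/hosts below are placeholders.

# 1. Flush MySQL to a plain SQL file that is safe to copy.
mysqldump --all-databases --single-transaction > /var/backups/all-databases.sql

# 2. Snapshot the data volume; writes after this point are copy-on-write,
#    so /dev/vg0/snap is a frozen view of the disk at this instant.
lvcreate --size 5G --snapshot --name snap /dev/vg0/lv_data

# 3. Mount the snapshot read-only and copy it off-host.
mkdir -p /mnt/snap
mount -o ro /dev/vg0/snap /mnt/snap
rsync -az /mnt/snap/ backup@backup.example.com:/srv/backups/data/

# 4. Drop the snapshot; the live volume was never interrupted.
umount /mnt/snap
lvremove -f /dev/vg0/snap
```

The snapshot size (`5G` here) only needs to hold the writes that happen while the backup runs, not a full copy of the volume.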

Alternatively, you could run your web server inside a virtualization system; a backup then just means suspending the VM and copying its image files (using rsync!) to a backup destination. Restoring is as simple as copying the backup files back and resuming the VM (you will have a short downtime while this happens, though some VM systems can minimise it if you suspend and use a snapshotting filesystem).

The best alternative, if you have the money, is to use something like R1Soft's continuous data protection, which backs up changes as they happen.


Rsync is great, but rdiff-backup is even better. Not only does it keep a mirror of all your files, it also lets you restore old versions from previous backups if you want. It only stores the parts of each file that changed (the "reverse diffs"), so you're not saving a whole lot of extra data to get the restore functionality.

It uses the same algorithms as rsync, but it's quite a bit more powerful and useful.
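A quick sketch of the workflow, with placeholder paths (the demo guards itself so it only runs where rdiff-backup is actually installed):

```shell
# Only run the demo if rdiff-backup is available on this machine.
if command -v rdiff-backup >/dev/null 2>&1; then
  SRC=/tmp/rdiff-demo/src
  DEST=/tmp/rdiff-demo/dest
  mkdir -p "$SRC"
  echo 'hello' > "$SRC/file.txt"

  # Each run updates the mirror in DEST and stores reverse diffs under
  # DEST/rdiff-backup-data, so older versions remain restorable.
  rdiff-backup "$SRC" "$DEST"

  # Restoring the version from 3 days ago would look like:
  #   rdiff-backup -r 3D "$DEST/file.txt" /tmp/file-restored.txt
fi
```

For a remote target you would use the same `host::/path` style destination you'd expect from rsync-like tools, with rdiff-backup installed on both ends.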

Also, duplicity is a system that works much like rdiff-backup, but it does everything on the client side (i.e. it doesn't need to be installed on the server). It can also encrypt the backups before you send them, and it can be configured to work with Amazon S3 storage.
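A minimal run against a local target might look like this sketch. The paths and passphrase are placeholders, the demo skips itself if duplicity isn't installed, and the exact S3 URL syntax varies between duplicity versions:

```shell
if command -v duplicity >/dev/null 2>&1; then
  SRC=/tmp/dup-demo/src
  DEST=/tmp/dup-demo/dest        # a real run might target an s3:// URL instead
  mkdir -p "$SRC" "$DEST"
  echo 'data' > "$SRC/file.txt"
  # PASSPHRASE is used to GPG-encrypt the archive before it leaves the client.
  PASSPHRASE=example-passphrase duplicity "$SRC" "file://$DEST"
fi
```

The first run produces a full backup; later runs against the same target produce incrementals automatically.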


Rsync is the way to go for simple setups.

For bigger setups, consider BackupPC :-)