Using rsync to back up to an external drive

I'm buying an external hard drive to back up the computers in my house (finally!!). I'm hoping to use rsync. I've seen an example that does (or seems to do) exactly what I want. Something like this:

rsync -aE --delete /path/to/what/I/want/to/backup /Volumes/FW200/Backups

However, when looking at the rsync documentation and examples and so on, things started to look MUCH more complicated than that. Networking and daemons and jargon, oh my!

I'm assuming that none of that stuff is necessary as long as I'm just rsyncing from a computer to a FireWire-connected external drive. Am I wrong in assuming that? Are things really going to be more complicated than that innocuous command?


Rsync works fine across local drives. However, when both the source and the destination are local paths, it defaults to --whole-file mode, which skips the delta-transfer (diff) algorithm and simply copies the source file over the destination file. Rsync will still skip files that haven't changed at all, though. When bandwidth between the source and destination is high (as with two local disks), copying the whole file is much faster than reading both files and then copying just the changed bits.


I use rsync with the following flags, handily memorable as 'glop', 'trunc' and 'v'.

rsync -gloptruncv $srcdir $dstdir

A brief guide:

  • g - preserve group ownership info
  • l - copy symlinks as symlinks
  • o - preserve owner info
  • p - preserve permissions
  • t - preserve timestamps
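  • r - recurse through directories
  • u - update, skip any files that are newer on the destination
  • [n] - no, don't do this, do a dry run instead
  • c - checksum, decide what needs copying by comparing file checksums instead of size and timestamp (slower, since every file gets read)
    note: this is separate from the diff (delta) transfer, which is what gets overridden on local filesystems, where entire files are copied instead.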
  • v - verbose

I always run the above to make sure it works, then remove the 'n' flag once I'm happy with the results.

The key features of the above combinations:

  • I run it in BOTH directions between two (or more) servers, thus syncing in BOTH directions. You update whichever you feel is the master at the time.
  • It allows either to be the master, with the significant caveat that if you want to delete something, you must delete it on both to be sure it's really gone, else it comes back.

I use this to keep two machines in sync, or to keep two subdirs in sync (like backing up to a USB drive).

As one of the other posts stated earlier, rsync copies whole files instead of diffs when you are dealing with local drives. That affects how changed files are transferred, not the 'checksum' comparison itself.

In some rare instances, I've had to add additional parameters to account for changes in login accounts across remote machines, changing ports, and even specifying where 'rsync' lives on the remote host... but those are not directly applicable to your question.


None of it is necessary. You can use rsync without any daemons or any other kind of configuration JUST FINE!

Just use the rsync command and you are good to go.


Judging by the path in your rsync command, would I be right in thinking you're using Mac OS X?

Personally, I'd opt for using Time Machine (if you're using Leopard), or Carbon Copy Cloner (http://www.bombich.com/software/ccc.html) which uses rsync.

Much easier than trying to write your own script. One advantage is that Time Machine and CCC will both give you incremental backups.


The example you used looks like it will work just fine for backups.

One thing you might want to consider when using rsync however is to make use of the --link-dest option. This lets you keep multiple backups, but use hard links for any unchanged files, effectively making all backups take the space of an incremental. An example use would be:

rsync -aE --link-dest=/mnt/external_disk/backup_20090612 dir_to_backup \
    /mnt/external_disk/backup_20090613

This assumes you have a dated backup for June 12 and you want to create a new one on June 13. You can add the -v option if you want a printout of every file.