Automating duplicity backups using cron
Background
Original Reference: http://peterpetrakis.blogspot.com/2013/06/automating-and-encrypting-duplicity.html
Having suffered data loss in the past, and having spent time hacking on storage, I have learned that regular backups are a good idea. I wanted redundancy in case my local server failed, and I wanted to encrypt my backups using a password-protected GPG key.
The current solution uses a passphrase kept in plain text outside of the backup path. I plan to investigate moving the GPG key to a smartcard and using a PIN to unlock it instead. If anyone has additional solutions, please describe them in detail.
Persisting requisite environmental variables
Running anything from cron detaches it from your current environment. You lose all of the variables describing things like your ssh-agent and gpg-agent, which you need in order to communicate with the remote server.
I took a simple approach: in my ~/.bashrc I added the following.
    cat > ~/.backenvrc << EOF
    # used by crontab backup script
    export SSH_AGENT_PID=$SSH_AGENT_PID
    export SSH_AUTH_SOCK=$SSH_AUTH_SOCK
    export GPG_AGENT_INFO=$GPG_AGENT_INFO
    export GPGKEY=XXX-insert-your-gpg-key-here-XXX
    EOF
I then simply source this file from the backup script referenced in my crontab. I only need to log in once to populate it.
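Because the file is only as fresh as the last login, the backup script may want to confirm that the recorded agent socket still exists before relying on it. A minimal sketch, assuming the ~/.backenvrc layout shown above; the function name `load_backup_env` is hypothetical and not part of the original setup:

```shell
#!/bin/bash
# Hypothetical helper: source the saved environment and verify that the
# ssh-agent socket recorded at login time is still present.
load_backup_env() {
    local envfile="${1:-$HOME/.backenvrc}"

    if [ ! -r "$envfile" ]; then
        echo "cannot read $envfile" >&2
        return 1
    fi

    # Pull in SSH_AUTH_SOCK, GPG_AGENT_INFO and friends
    . "$envfile"

    # -S is true only if the path exists and is a socket
    if [ ! -S "${SSH_AUTH_SOCK:-}" ]; then
        echo "stale or missing SSH_AUTH_SOCK: ${SSH_AUTH_SOCK:-unset}" >&2
        return 1
    fi
}
```

Calling `load_backup_env || exit 1` early in the cron script would make a stale environment show up in the nightly mail instead of failing later inside the backup run.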
Setting up the Crontab
    # crontab -l
    # m h dom mon dow command
    MAILTO=ppetraki@localhost
    BACKUP=/home/ppetraki/Documents/System/Backup
    #
    0 0 * * * /usr/bin/crontab -l > $BACKUP/crontab-backup
    0 0 * * * /usr/bin/dpkg --get-selections > $BACKUP/installed-software
    0 0 * * * /usr/local/bin/ppetraki-backup.sh inc
    0 0 * * Fri /usr/local/bin/ppetraki-backup.sh full
Note that I am also backing up my crontab and my list of installed software. Eventually I will move this into another script that also does things like:
1) back up my bookmarks from Chrome and Firefox
2) back up mail in a non-binary format
The current crontab performs an incremental backup every night and a full backup every Friday.
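As written, the nightly `inc` entry also fires on Fridays, at the same minute as the `full` entry. One possible way to avoid the overlap is to let a single entry decide which action to run. A sketch under that assumption; `pick_action` is a hypothetical helper, not part of the original setup:

```shell
#!/bin/bash
# Hypothetical helper: choose the backup action from the ISO weekday
# (1 = Monday ... 7 = Sunday). Friday gets a full backup, every other
# day an incremental one.
pick_action() {
    local dow="${1:-$(date +%u)}"
    if [ "$dow" -eq 5 ]; then
        echo full
    else
        echo inc
    fi
}

# Example: print the action that would run today
pick_action
```

The driver script could fall back to `pick_action` when it is called without an argument, which would reduce the two backup lines in the crontab to one.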
Driver script
This wraps the invocation of duplicity and acquires the necessary environment variables. Duplicity itself can be hairy with all the command line switches, and even more of a burden if you have multiple targets. I have redundant backups, first to a local server and then to a remote service provided by rsync.net (great customer support!). I found horcrux to be a wonderful, lightweight duplicity wrapper that suits my needs.
The driver script, which is external to my backup path, also contains the GPG passphrase used to encrypt my backups. Eventually I wish to move to a smartcard-driven system, [illustrated here](http://blog.josefsson.org/2011/10/11/unattended-ssh-with-smartcard/).
[/usr/local/bin/ppetraki-backup.sh]
    #!/bin/bash
    export PATH=$PATH:/usr/local/bin
    action=$1
    export USER=XXX
    export HOME=/home/$USER
    # pick up the agent variables saved at login time
    source "$HOME/.backenvrc"
    echo "verifying environment"
    echo "gpg-agent: ${GPG_AGENT_INFO}"
    echo "gpg-key: ${GPGKEY}"
    echo "ssh-agent-pid: ${SSH_AGENT_PID}"
    echo "ssh-auth-sock: ${SSH_AUTH_SOCK}"
    if [ -z "$action" ]; then
        echo "requires an action!"
        exit 1
    fi
    # GPG passphrase goes here (left blank in this listing)
    export PASSPHRASE=
    [ -z "$PASSPHRASE" ] && exit 1
    echo "begin"
    for config in local_backup remote_backup
    do
        horcrux clean "$config"
        horcrux "$action" "$config"
    done
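Since the passphrase lives in plain text, one option is to keep it in a separate file readable only by its owner rather than in the script itself. A minimal sketch; the helper name `read_passphrase` and the file path in the usage comment are assumptions, and `stat -c` is the GNU coreutils form:

```shell
#!/bin/bash
# Hypothetical helper: print the first line of a passphrase file, but
# only if the file is not accessible to group or others.
read_passphrase() {
    local file="$1" mode line

    [ -r "$file" ] || { echo "cannot read $file" >&2; return 1; }

    # GNU stat: %a prints the permission bits in octal, e.g. 600
    mode=$(stat -c '%a' "$file") || return 1
    case "$mode" in
        600|400) ;;
        *) echo "$file has mode $mode, expected 600 or 400" >&2; return 1 ;;
    esac

    # Accept a file with or without a trailing newline
    IFS= read -r line < "$file" || [ -n "$line" ] || return 1
    printf '%s\n' "$line"
}

# Possible use in the driver script (path is an assumption):
#   PASSPHRASE=$(read_passphrase "$HOME/.backup-passphrase") || exit 1
#   export PASSPHRASE
```

This does not make the secret any less plain text; it only separates it from the script and refuses to run when the permissions have drifted.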
Using horcrux to wrangle duplicity
Horcrux has the notion of profiles, which take all the complexity out of managing the duplicity CLI. Here's an example of a profile.
    $ cat /home/ppetraki/.horcrux/local_backup-config
    destination_path="rsync://192.168.1.XXX/backups/personal"
    $ cat ~/.horcrux/local_backup-exclude
    - /home/ppetraki/Sandbox
    - /home/ppetraki/Bugs
    - /home/ppetraki/Downloads
    - /home/ppetraki/Videos
    - /home/ppetraki/.xsession-errors
    - /home/ppetraki/.thumbnails
    - /home/ppetraki/.local
    - /home/ppetraki/.gvfs
    - /home/ppetraki/.systemtap
    - /home/ppetraki/.adobe/Flash_Player/AssetCache
    - /home/ppetraki/.thunderbird
    - /home/ppetraki/.mozilla
    - /home/ppetraki/.config/google-googletalkplugin
    - /home/ppetraki/.config/google-chrome
    - /home/ppetraki/.cache
    - /home/ppetraki/**[cC]ache*
I found it problematic to back up only subdirectories of things like mozilla and google-chrome. Instead, I will write an additional script to cherry-pick those files for backup.
The main horcrux config file
    $ cat ~/.horcrux/horcrux.conf
    source="/home/ppetraki/"   # Ensure trailing slash
    encrypt_key=XXXXXX         # Public key ID to encrypt backups with
    sign_key='-'               # Key ID to sign backups with (leave as '-' for no signing)
    use_agent=false            # Use gpg-agent?
    remove_n=3                 # Number of full filesets to remove
    verbosity=5                # Logs all the file changes (see duplicity man page)
    vol_size=25                # Split the backup into 25MB volumes
    full_if_old=30D            # Cause 'full' operation to perform a full
                               # backup if older than 30 days
    backup_basename='backup'   # Directory name for local backups (i.e., destination
                               # /Volumes/my_drive/backup/ or /media/my_drive/backup/)
    dup_params='--use-agent'   # Parameters to pass to Duplicity
This is great as it reduces a backup invocation to this:
$ horcrux inc local_backup
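For comparison, the following sketch prints roughly the kind of duplicity command line that such a profile stands in for. It is an illustration assembled from the configuration values shown above and from standard duplicity options, not horcrux's actual invocation; the function name `print_duplicity_cmd` is hypothetical.

```shell
#!/bin/bash
# Sketch only: print a duplicity command line comparable to what a
# horcrux profile replaces. This is NOT what horcrux really executes.
print_duplicity_cmd() {
    local action="$1" profile="$2"

    # Values copied from the example horcrux.conf and profile
    local source_dir="/home/ppetraki/"
    local destination="rsync://192.168.1.XXX/backups/personal"
    local encrypt_key="XXXXXX"

    printf '%s ' duplicity "$action" \
        --encrypt-key "$encrypt_key" \
        --volsize 25 \
        --verbosity 5 \
        --use-agent \
        --exclude-filelist "$HOME/.horcrux/${profile}-exclude" \
        "$source_dir" "$destination"
    printf '\n'
}

# Only prints the command; nothing is backed up
print_duplicity_cmd incremental local_backup
```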
Monitoring
I defined MAILTO in my crontab, installed mutt, and then reconfigured postfix for local mail delivery. Every night I get a progress report on how the backups ran.
Conclusion
I've spent quite a bit of time determining how to automate this and provide strong encryption. I hope you find this useful.