Are encrypted backups a good idea?
I'm in charge of a small set of laptops and wanted to get some sort of automated remote (over WAN) backups going; the backups would go to a RAID drive. Because we don't really have a secure vault to hold all our drives (if someone wanted to, they could smash the windows and take them), I was considering using encryption, maybe something like duplicity (http://duplicity.nongnu.org/). I've discussed this with a few of the developers, but they seem convinced that encryption is a bad idea because a single bad bit could ruin the whole block. However, I haven't actually heard of any horror stories where this happened. What's your opinion? Do the benefits outweigh the risks with encryption?
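For concreteness, the sort of setup I had in mind is roughly the following; this is only a sketch, and the key ID, hostname, and paths are placeholders I've made up:

    # Nightly job: GPG-encrypted incremental backup of the laptop's home
    # directory to the remote RAID box (key ID and URL are placeholders).
    duplicity --encrypt-key "$BACKUP_KEY_ID" /home/user sftp://backup@server.example.com//srv/backups/laptop1

    # Restoring pulls the chain back down and decrypts it with the private key:
    duplicity restore sftp://backup@server.example.com//srv/backups/laptop1 /tmp/restore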
I do this, and it works for me (as in, the backups run fine, and I have done several restores). I use bacula, which supports it. For my money, pointers to getting this right include:
1) The decryption key is kept (unencrypted) on a CD-R, not in my head. There are several copies of the CD-R, and none of them is with me. I only need to do restores once every few months, and frankly I'd probably forget a password that I used that infrequently. This also means my security isn't limited by the length of a memorable passphrase.
2) The backup software needs to support this already; don't start hacking it into your favourite tool, because you may get it wrong, and you won't find out until it's vital that it works (i.e., restore time). There's a minimal configuration sketch at the end of this answer.
3) You've a point about bit-flips, but those can ruin any backup. I clean my tape heads every time the drive requests it, rotate the tapes every few years, and above all keep lots of incrementals. If a bit-flip really did ruin yesterday's incremental, I can always go back to the day before, which will save most of my bum.
4) Document the restore procedure. Encryption always complicates things, more so if it's done well, and you don't want to have to reinvent the wheel when you're under pressure to get the accounts database back. I wrote a short README with lots of detail that's very specific to my setup (a real step-by-step guide, all pathnames explicitly listed, that sort of thing) and it's burned to the same CDs as the decryption keys.
5) Above all, test it a lot. You should be testing your restores regularly anyway, but once you've done something clever like this it becomes absolutely critical that you have confidence that things are working as they should.
Pros arising from this practice include not having to care when offsite storage loses a tape or two, as - being only human - they do from time to time; securely destroying old tapes is easy (throw them in the bin); and having all my file systems encrypted on-disc is no longer undermined by having a stack of unencrypted backup tapes in the fire safe next door.
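For what it's worth, the configuration that makes point 2 work in bacula is small. It's roughly a fragment like this in the file daemon config (the names and paths are examples from memory, not a drop-in recipe):

    FileDaemon {
      Name = laptop1-fd
      # Sign and encrypt file data on the client, before it reaches the storage daemon.
      PKI Signatures = Yes
      PKI Encryption = Yes
      # The client's certificate + private key; this is what lives on the CD-Rs.
      PKI Keypair = /etc/bacula/laptop1-fd.pem
      # A master certificate that can also decrypt, in case the client key is ever lost.
      PKI Master Key = /etc/bacula/master.cert
    }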
but they seem convinced that encryption is a bad idea because a single bad bit could ruin the whole block
That's not a very good reason for not using encryption. The same is true of most compressed backups. The best way to address the problem is to use a fault-tolerant filesystem on the backup target (fault-tolerant file formats are very thin on the ground, mostly because they don't deal with as many failure scenarios as fault-tolerant filesystems do).
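As a sketch of what that might look like (ZFS is one option; the pool name and device names are placeholders):

    # A mirrored ZFS pool checksums every block and can repair a corrupted
    # one from the other copy, so a single bad bit doesn't ruin an archive.
    zpool create backuppool mirror /dev/sdb /dev/sdc

    # Periodic scrubs find and fix silent corruption before you need a restore.
    zpool scrub backuppool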
As with any backup, you need to ensure that you can access the resources needed to restore the data, and periodically verify / run test restores.
However you're a lot more likely to lose a laptop than a backup server - so if your data is valuable, your first port of call should be working out how to secure the data on the laptop. That will probably have a lot of impact on your choice of how to secure the backup.
They have their benefits and drawbacks, like any other approach. The problem with bit-flips ruining everything [1] can be worked around by properly verifying backups, and by having more than one copy (which is always a good idea anyway -- one on-site for quick restore, and one off-site for DR purposes).
As far as the benefits of encryption itself go, it really comes down to how bad it would be if the backups got stolen. Most companies these days have enough critically sensitive data that it's worth at least doing a pilot project (to learn about the practical issues of managing the thing) and an ROI analysis on the whole thing.
[1] Practically speaking, disks lose whole blocks, not single bits, and if you did have an undetectable bit-flip in a non-encrypted backup, chances are you've silently corrupted something important anyway.
I use rsync over ssh to back up 100GB of remote webserver data every night - it takes just a few minutes to transfer the differences and keep a local mirror in sync. However, this relies on the destination not being encrypted.
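The nightly transfer is essentially a one-liner; the hostname and paths below are illustrative rather than my real ones:

    # Pull only the changes over ssh and keep /srv/mirror as an exact copy;
    # --delete removes local files that were deleted on the webserver.
    rsync -az --delete -e ssh backup@webserver.example.com:/var/www/ /srv/mirror/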
Once the data has been received, it can be archived with tar, compressed with gzip (optionally tested with gzip -t), then optionally encrypted, renamed with the date, and stored on a RAID system. (I found it easier to store the whole thing than to mess around with incrementals.)
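The archiving step looks roughly like this; the paths are made up, and symmetric GPG is just one way of doing the optional encryption:

    DATE=$(date +%F)

    # Archive and compress the mirror, then sanity-check the compressed stream.
    tar -czf "/srv/archives/webserver-$DATE.tar.gz" -C /srv/mirror .
    gzip -t "/srv/archives/webserver-$DATE.tar.gz"

    # Optionally encrypt before it goes onto the RAID volume (a public key
    # would work just as well as a symmetric passphrase here).
    gpg --symmetric --cipher-algo AES256 "/srv/archives/webserver-$DATE.tar.gz"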
If the RAID drive uses an encrypted filesystem, then it should be unnecessary to encrypt the backups further. If the drives are stolen, they should remain unreadable; but if someone takes the whole system and the key is available anywhere on it, then it's a moot point.
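If you go that route, LUKS is the usual tool; a minimal sketch, assuming the RAID device is /dev/md0 (this wipes whatever is on it):

    # One-time setup: encrypt the RAID device and put a filesystem on the mapping.
    cryptsetup luksFormat /dev/md0
    cryptsetup open /dev/md0 backup_crypt
    mkfs.ext4 /dev/mapper/backup_crypt

    # Before backups run, unlock and mount it:
    cryptsetup open /dev/md0 backup_crypt
    mount /dev/mapper/backup_crypt /srv/backups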
You could encrypt the files individually, but their security is only as good as the secrecy of the key.
You could serve the key over HTTPS from a remote webserver as a function of the source IP address; that way, if the system and backups were stolen, you would still have some control over the key and could disable it in time.
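One way to sketch that, assuming an nginx front end on the key server (the hostname, path, and address are invented for the example):

    # Only the backup system's source address may fetch the decryption key;
    # everyone else gets a 403. Revoking access is just editing this list.
    location /keys/backup.key {
        allow 203.0.113.10;   # the backup system's public IP (example address)
        deny  all;
    }

The backup system can then fetch the key at restore time with something like curl -fsS https://keys.example.com/keys/backup.key, and you keep the option of pulling the key offline if the hardware goes missing.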
Always keep multiple copies, locally and remotely.