btrfs-enabled backup solution
With btrfs hitting production in Oracle EL on the 14th of this month (together with a working fsck and scrubbing as of Linux 3.2), I am thinking of redesigning my current backup solution to use it. Note that I'm considering this for small amounts of data, less than 10TB, that are fairly static (less than 1% changes daily). In short, an SMB/SOHO backup solution.
What the backup should do:
- take an LVM snapshot of the ext[234]/XFS/JFS filesystem on the production server
- rsync/transfer changed data to btrfs on the backup server
- snapshot the btrfs filesystem
- drop old snapshots when free space is running low
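The first three steps above can be sketched as a sequence of commands. This is only a minimal sketch under stated assumptions: the volume group, logical volume, mount point, host name and paths are all hypothetical, the 5G snapshot size is an arbitrary placeholder, and the function only builds the command lines so they can be reviewed before anything is run.

```python
import shlex
from datetime import datetime


def backup_commands(vg, lv, snap_mount, backup_host, backup_subvol,
                    snapshot_dir, when, snap_size="5G"):
    """Build (but do not run) the commands for one backup cycle.

    Every name passed in is a placeholder chosen for illustration.
    """
    snap_name = f"{lv}_backup"
    stamp = when.strftime("%Y-%m-%dT%H%M%S")
    # The remote btrfs call is quoted as a single string for ssh.
    remote_snapshot = shlex.join([
        "btrfs", "subvolume", "snapshot", "-r",
        backup_subvol, f"{snapshot_dir}/{stamp}",
    ])
    return [
        # 1. Freeze a point-in-time view of the production filesystem.
        ["lvcreate", "--snapshot", "--size", snap_size,
         "--name", snap_name, f"/dev/{vg}/{lv}"],
        ["mount", "-o", "ro", f"/dev/{vg}/{snap_name}", snap_mount],
        # 2. Transfer changes; --inplace rewrites only changed blocks,
        #    which keeps copy-on-write snapshots small on the target.
        ["rsync", "-a", "--delete", "--inplace", "--numeric-ids",
         f"{snap_mount}/", f"{backup_host}:{backup_subvol}/"],
        ["umount", snap_mount],
        ["lvremove", "-f", f"/dev/{vg}/{snap_name}"],
        # 3. Take a read-only btrfs snapshot of the freshly synced data.
        ["ssh", backup_host, remote_snapshot],
    ]


if __name__ == "__main__":
    for cmd in backup_commands("vg0", "data", "/mnt/lvm-snap", "backuphost",
                               "/backup/current", "/backup/snapshots",
                               datetime.now()):
        print(shlex.join(cmd))
```

Running the script prints the commands rather than executing them. An XFS source would additionally need the `nouuid` mount option to mount the LVM snapshot alongside its origin, and `/backup/current` is assumed to be a btrfs subvolume.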
Pros:
- All files easily available, no decompression or loop mounting needed
- Past snapshots also easily available...
- ... so I can share them as read-only Samba shares (with shadow copy support)
- Snapshots take a minimal amount of space thanks to copy-on-write (a snapshot without changes takes literally a few KiB on disk)
- High backup consistency: checksums on files, scrubbing of all data and built-in redundancy
Questions:
- Is there a backup solution (along the lines of Bacula, BackupPC, etc.) that is, or can easily be made, aware of a copy-on-write file system?
- Or will I need to use an in-house rsync solution?
- What do people with ZFS boxes dedicated to backup do to back up their Linux machines?
I've done some extensive searching in the last week for something similar, and I have found no solution that does all four steps. There are numerous blogs from home users who try 'rsync to btrfs'-type backups, and all of the major Btrfs wikis cover how to take Btrfs snapshots.
There are also quite a few people attempting different ways of rotating Btrfs snapshots. However, you are the first person I've seen who wants to rotate snapshots based on disk space. I am playing with btrfs-snap myself, which creates a set of hourly, weekly and monthly snapshots, and it's nice and simple.
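Rotating by disk space rather than by age could look roughly like the sketch below. It is not taken from any existing tool: the function names, the threshold and the `keep_at_least` safeguard are assumptions made for illustration, and snapshot names are assumed to sort chronologically (for example, ISO-style timestamps).

```python
import shutil
import subprocess


def free_bytes(path):
    """Free space as reported by the OS; on btrfs this is only an estimate."""
    return shutil.disk_usage(path).free


def delete_subvolume(path):
    """Delete one btrfs snapshot (needs root on the backup server)."""
    subprocess.run(["btrfs", "subvolume", "delete", path], check=True)


def prune_until_free(snapshots, min_free_bytes, get_free, delete,
                     keep_at_least=1):
    """Delete the oldest snapshots until enough space is free.

    get_free and delete are passed in so the policy can be tested
    without touching a real filesystem. Returns the deleted names.
    """
    ordered = sorted(snapshots)  # oldest first, given sortable names
    candidates = ordered[:max(0, len(ordered) - keep_at_least)]
    deleted = []
    for snap in candidates:
        if get_free() >= min_free_bytes:
            break
        delete(snap)
        deleted.append(snap)
    return deleted
```

In real use, `get_free` would be something like `lambda: free_bytes("/backup")` and `delete` would be `delete_subvolume`. Btrfs reclaims the space of a deleted subvolume in the background, so a real loop may have to wait before re-checking free space, otherwise it can delete more snapshots than necessary.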
The Dirvish project seems to meet many of your requirements. Some developers are attempting to integrate Dirvish with Btrfs. However, the Dirvish project seems a bit stalled.
At this point in time, you are ahead of the curve.
According to Avi Miller (in his talk at linux.conf.au), btrfs send/receive is being worked on. It will be faster than rsync since it doesn't need to traverse directories to find changed files. I don't know if there's an expected release date yet, though.
There is, however, a utility built into btrfs-progs that lists every file that has changed between snapshots: btrfs subvolume find-new
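As a rough illustration, the output of `btrfs subvolume find-new <subvol> <last_gen>` could be turned into a list of changed paths as below. The line format matched here ("inode ... flags ... path" plus a final "transid marker was N" line) is an assumption based on what btrfs-progs has printed; it is not a documented stable interface, so treat the parser as a sketch. The helper names are hypothetical.

```python
import re
import subprocess

# Assumed shape of one extent line printed by find-new.
_EXTENT_LINE = re.compile(
    r"^inode (\d+) file offset (\d+) len (\d+) disk start (\d+) "
    r"offset (\d+) gen (\d+) flags (\S+) (.+)$"
)
_MARKER_LINE = re.compile(r"^transid marker was (\d+)$")


def parse_find_new(output):
    """Return (unique changed paths in order seen, transid marker or None)."""
    paths = []
    seen = set()
    marker = None
    for line in output.splitlines():
        match = _EXTENT_LINE.match(line)
        if match:
            path = match.group(8)
            if path not in seen:  # one file may span many extents
                seen.add(path)
                paths.append(path)
            continue
        match = _MARKER_LINE.match(line)
        if match:
            marker = int(match.group(1))
    return paths, marker


def changed_since(subvol, last_gen):
    """Run find-new on a subvolume and parse its output."""
    result = subprocess.run(
        ["btrfs", "subvolume", "find-new", subvol, str(last_gen)],
        capture_output=True, text=True, check=True,
    )
    return parse_find_new(result.stdout)
```

The returned marker can be stored and passed as `last_gen` on the next run, so each run only reports files changed since the previous one.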