What is the most efficient way to backup a directory full of large database backup files?

We use NetBackup here at Stack Exchange, and I'm working on revamping our backup policy to be more efficient.

Currently, we use SQL Server 2008 R2 and have it run maintenance plans to back up the data to a .bak file. Once that file is written, we then back up the directory that the .bak files are stored in.
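
For context, that maintenance-plan step boils down to a `BACKUP DATABASE ... TO DISK` statement. A minimal Python/pyodbc sketch of the same thing is below; the database name, backup path, and connection string are placeholders, not our actual setup:

```python
import pyodbc

# Placeholders: adjust the connection string, database name, and path.
conn = pyodbc.connect(
    "DRIVER={SQL Server};SERVER=localhost;Trusted_Connection=yes",
    autocommit=True,  # BACKUP DATABASE can't run inside a transaction
)
cur = conn.cursor()
cur.execute(
    r"BACKUP DATABASE [StackDB] "
    r"TO DISK = 'D:\Backups\StackDB_full.bak' "
    r"WITH INIT"
)
while cur.nextset():  # drain informational messages so the backup finishes cleanly
    pass
conn.close()
```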

We don't use the SQL Agent for NetBackup since we use the .bak files for other things beyond just simple backups.

I was thinking of doing a schedule of Weekly/Diff/Cume rotations, but given that the directories will contain large files that are guaranteed to be new every day, and that our system automatically ages out backups older than a certain number of days, I suspect the standard "office fileserver" scenario is less efficient than other methods.

Is there a "most efficient" way to handle this?


Solution 1:

I've got very little experience with SQL Server backups, so take all of this with a pound of salt, and investigate SQL Server agents for various backup technologies (Bacula claims to have one) before trying my half-baked scheme below.


My solution to database backups is very much PostgreSQL-specific: I mirror to a slave, and when backup time comes I shut that slave down, let Bacula back up the DB directory, and restart the slave so it can catch up on replication.
This has the advantage of fast restores and a fair compromise on backup size (only the table-backing files that changed are backed up, but the backup process does grab the whole table, not just the delta).
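
In script form, my nightly cycle is roughly the following. This is only a sketch: the data directory path and the command that kicks off the Bacula job are placeholders for whatever your setup actually uses.

```python
import subprocess

PGDATA = "/var/lib/postgresql/9.1/main"  # hypothetical slave data directory

def run(cmd):
    """Run a command and fail loudly if it returns non-zero."""
    subprocess.run(cmd, check=True)

# 1. Stop the replication slave cleanly so the data files are consistent on disk.
run(["pg_ctl", "stop", "-D", PGDATA, "-m", "fast"])

try:
    # 2. Let the file-level backup run against the quiesced data directory.
    #    Placeholder: however your Bacula director kicks off the client job.
    run(["/usr/local/sbin/trigger-bacula-job"])
finally:
    # 3. Restart the slave so it catches up on replication, even if the backup failed.
    run(["pg_ctl", "start", "-D", PGDATA, "-w"])
```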

Something similar might work in your case. At first blush I'd suggest:

  1. Set up a slave server
  2. Set up a machine at a remote site running an rsync daemon to rsync to.
  3. Every night at backup time, shut down the slave and rsync the database files to the remote site, then restart the slave and let it catch up on replication (a sketch of this sequence follows below).
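
Here's a minimal sketch of that nightly sequence, again with hypothetical paths and a made-up remote host and rsync module name:

```python
import subprocess

PGDATA = "/var/lib/postgresql/9.1/main"           # hypothetical slave data directory
REMOTE = "rsync://backuphost.example.com/pgdata"  # hypothetical rsync daemon target

def run(cmd):
    subprocess.run(cmd, check=True)

# Quiesce the slave so the data files are consistent.
run(["pg_ctl", "stop", "-D", PGDATA, "-m", "fast"])
try:
    # -a        preserve permissions, ownership, and timestamps
    # --delete  drop files on the remote that no longer exist on the slave
    # Against a remote rsync daemon, rsync's delta-transfer algorithm only
    # ships changed blocks over the wire.
    run(["rsync", "-a", "--delete", PGDATA + "/", REMOTE + "/"])
finally:
    # Restart the slave so it can catch up on replication.
    run(["pg_ctl", "start", "-D", PGDATA, "-w"])
```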

This is a very similar setup to what I'm doing, except that by rsyncing your data directly you can take advantage of rsync's block-level delta transfer (and hopefully ship proportionally less data over the wire than I do when grabbing the full table-backing files).