Should static data be backed up every time to tape?
In the Backup and Recovery book, they write that it is good practice to make a full backup every month, and then incremental or differential backups each week.
What if I have 800 GB of data and ~10 GB of changes per week?
Should I still make a full backup each month?
I mean, on LTO tapes they guarantee data integrity for 30 years.
So why make full backups each time?
That is generic guidance. Specific guidance is much better.
The big questions you need to answer before you start setting up your backup retention schedule are:
How much data am I willing to lose, and how long am I willing to take to recover what I can?
Tape backup is near the bottom of the backup/disaster-recovery hierarchy. Very roughly, that is (and I'm sure I'll forget a few steps):
- RAID (data-loss prevention)
- Traditional data backup
- Multi-site data backup
- Data replication
- Cold fail-over services
- Hot fail-over services
- Load-balanced replicated services
- Multi-site replication
- Multi-site cold failover services
- Multi-site hot failover services
- Multi-site load-balanced replicated services
We're talking about steps 2 and 3 here. How fast you can get your data back depends on several factors:
- How much of it you have
- How many backup sets you have to go through to get it all back
- What those backup sets are stored on
- How fast the hardware supporting all of this (servers, network, and backup hardware) can run
- Whether the backup system can do a 'differential' backup, or just Full/Incremental
In case you haven't run into the term before, a Differential backup is defined as "everything that has changed since the last full backup". I think the term originated with BackupExec and has since been adopted elsewhere. But I digress.
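To make the distinction concrete, here is a minimal sketch of how the two kinds of net-change backup pick their files. It is not how any particular product does it; `files_to_back_up`, the day numbers, and the file names are all made up for illustration, and real backup software tracks changes in its own catalog rather than with a simple timestamp comparison.

```python
def files_to_back_up(modified_on, last_full_day, last_backup_day, kind):
    """Return the files a net-change backup would pick up.

    modified_on: dict mapping file name -> day number of last modification
    kind: "differential" (changed since the last full) or
          "incremental" (changed since the last backup of any kind)
    All names and day numbers here are hypothetical.
    """
    if kind == "differential":
        cutoff = last_full_day
    elif kind == "incremental":
        cutoff = last_backup_day
    else:
        raise ValueError("kind must be 'differential' or 'incremental'")
    return sorted(name for name, day in modified_on.items() if day > cutoff)


# Full backup ran on day 2, the most recent backup of any kind on day 4.
modified_on = {"a.txt": 1, "b.txt": 3, "c.txt": 5}
print(files_to_back_up(modified_on, 2, 4, "differential"))  # ['b.txt', 'c.txt']
print(files_to_back_up(modified_on, 2, 4, "incremental"))   # ['c.txt']
```

The differential set keeps growing until the next full, which is why a restore only needs the full plus the latest differential.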
In the book's backup scheme (one full a month, net-change daily for the rest of it), the worst-case disaster recovery scenario is a data-loss event the day before the full backup is taken. Recovering in that case will require:
- The last Full backup, 29 days ago
- Every single tape since then, all 28 of them.
Depending on the aforementioned variables, this recovery could take a really long time.
Take an alternate scenario: Full on Friday, net-change the other 6 days. The worst-case recovery here is a loss event Friday afternoon. Recovering in that case will need:
- Last Friday's tape
- The other 6 tapes
This should take a lot less time.
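The tape counts for the two scenarios can be sketched in a few lines. This is only back-of-the-envelope arithmetic, assuming one tape per backup run; `tapes_needed` is a hypothetical helper, and the `differential` option is added just to show what the differential definition above buys you.

```python
def tapes_needed(net_change_runs, differential=False):
    """Backup sets required for a full restore.

    net_change_runs: net-change backups taken since the last full
    differential: True if each net-change run is a differential
                  (only the newest one is needed), False if incremental
                  (every one is needed, in order)
    Assumes one tape per backup run, which is a simplification.
    """
    if net_change_runs == 0:
        return 1  # just the full
    if differential:
        return 2  # the full plus the most recent differential
    return 1 + net_change_runs  # the full plus every incremental


print(tapes_needed(28))                     # monthly full, daily incremental: 29
print(tapes_needed(6))                      # weekly full, daily incremental: 7
print(tapes_needed(28, differential=True))  # monthly full, daily differential: 2
```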
One thing that hasn't been covered is what happens when a backup tape is bad. With 30 days between fulls, a bad tape can cost you anywhere from 1 to 59 days of data loss. If that's unacceptable, run your full backups more often.
One thing some backup-to-disk vendors are selling these days is something called a synthetic full backup. How it works is that you take an initial full backup and then do net-change forever more. On a set schedule you do a synthetic full backup, which coalesces a week's/two weeks'/a month's worth of net-change with the last full backup to come up with a virtual full backup. This is handy for staying within backup windows.
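The coalescing step is roughly the following. This is a toy model, not any vendor's implementation: backup sets are plain dicts of path to content, `None` stands in for "file was deleted", and `synthetic_full` is a name I made up for the sketch.

```python
def synthetic_full(last_full, net_changes):
    """Coalesce a full backup with later net-change sets into a new full.

    last_full: dict mapping path -> content at the time of the full
    net_changes: list of dicts, oldest first; a value of None marks a
                 deleted file (an assumption made for this sketch)
    Returns a new dict; the inputs are left untouched.
    """
    merged = dict(last_full)
    for change_set in net_changes:
        for path, content in change_set.items():
            if content is None:
                merged.pop(path, None)  # file was deleted after the full
            else:
                merged[path] = content  # newer version wins
    return merged


full = {"a.txt": "v1", "b.txt": "v1"}
week = [{"a.txt": "v2"}, {"b.txt": None, "c.txt": "v1"}]
print(synthetic_full(full, week))  # {'a.txt': 'v2', 'c.txt': 'v1'}
```

The point is that the merge happens on the backup server, so the production machines never have to push a full set of data again.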
When running a hybrid disk/tape system, you do your weekly/monthly backups to disk, and then spool archive sets off to tape to sit on a shelf for 3/5/7/10 years. When used in combination with something that can do a synthetic full, a synthetic full can be spun to tape and sent off-site on a regular schedule. Hybrid systems offer the most flexibility these days, and I recommend them whenever possible. Disk for short-term, tape for long-term.
(what mailq said) plus: Doing incremental forever is not common practice with tapes, as you can lose the tape with the full backup on it and render your entire backup useless.
The shift right now is to do full backup + incremental forever against disk backup with deduplication. This can basically run forever, and you normally run RAID6 underneath, which can tolerate 2 disks failing. That, plus weekly/monthly/quarterly/yearly tape backups stored in some vault far, far below ground.