What is a good schedule / methodology for test-restoring my backups?
Depending on the environment, this could be tricky.
In a dedicated-server environment, it would be most helpful to have a backup machine whose hardware is identical to the primary. Take the backup machine offline, then run the restores on it while it's isolated. Once you're satisfied that it restored properly, ensure that the whole process was properly documented. If you're really serious, have someone else try to follow your instructions with no outside help.
The acid test is swapping the restored backup machine in for the primary. This won't work, of course, if you work with live data that is constantly changing. It's also risky and not strictly necessary, but it is the strongest proof you can get that your (newly) established procedures will actually let you recover.
Since I don't have a backup machine, and I can afford to take my server offline for hours, I do the following:
- Run a fresh backup
- Replace all of the relevant hard drives with spares. Label and keep the originals in case things go badly.
- Run the restore procedure, documenting along the way.
- In this case, just return the restored system to service. The original HDDs go on the shelf as spares and as a really last-ditch recovery option.
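One way to become "satisfied that it restored properly" is to record a checksum of every file before the backup and compare it against the restored system afterwards. The following is only a minimal sketch of that idea, not part of my actual procedure: the function names `build_manifest` and `compare_manifests` are made up for illustration, and it only looks at the contents of regular files (no permissions, ownership, or special files).

```python
import hashlib
import os


def build_manifest(root):
    """Map each regular file's path (relative to root) to its SHA-256 digest."""
    manifest = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            # Skip anything that is not a readable regular file
            # (for example, a broken symlink).
            if not os.path.isfile(full):
                continue
            digest = hashlib.sha256()
            with open(full, "rb") as fh:
                # Read in 1 MiB chunks so large files do not fill memory.
                for chunk in iter(lambda: fh.read(1 << 20), b""):
                    digest.update(chunk)
            manifest[os.path.relpath(full, root)] = digest.hexdigest()
    return manifest


def compare_manifests(before, after):
    """Report files that are missing, unexpected, or changed after a restore."""
    before_paths = set(before)
    after_paths = set(after)
    return {
        "missing": sorted(before_paths - after_paths),
        "extra": sorted(after_paths - before_paths),
        "changed": sorted(
            path
            for path in before_paths & after_paths
            if before[path] != after[path]
        ),
    }
```

In practice you would build the manifest right after the fresh backup, store it somewhere off the server (for example as JSON), and build a second one from the restored drives. Files that legitimately changed between the backup and the first manifest will show up as "changed", so this works best on a quiet system.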
I won't attempt to comment on VM environments. I know only enough about them to be dangerous.
Be sure to review and test your backup and restore system whenever you make significant changes to your server. It wouldn't do to have a problem, dust off the 5-year-old binder with the recovery instructions, and only then realize that it:
- is written for hardware you no longer have,
- covers software that has undergone drastic changes three times,
- and doesn't even mention the four new roles you've added to this server since the last time the book was updated.
In closing, the key is to carefully document the entire process. Write it out in such a way that a new hire who started last week, fresh out of school, can successfully get things going again with no outside help. Then test it.
Good luck!