How do you back up a SAN?

Long-time reader, first-time asker :)

I have been reading up a lot on iSCSI and SANs in general and I believe I have been able to answer most of my questions and concerns on the topic, but this one remains:

How do you "backup a SAN?"

What follows is a more or less real-world scenario and my thoughts and questions about it.

Suppose you managed to convince the management of your small (at best medium-sized) company to finally provide the funds for a small but proper storage solution, i.e. an iSCSI-based SAN. Suppose it consists of a server with many drives in an enclosure running OpenFiler, or even an MD3000i (Dell) or MSA2000i (HP), which are, as I understand it, the most common entry-level solutions.

The LUNs are exported to a server that needs to store code repositories, documents, images and the like, another server that runs a few databases, another that uses the LUNs as disks for virtualization guests (DomUs in Xen-speak), and yet another server that exports one big LUN containing users' home directories via NFS (this is a pure Linux shop). The advantages are clear, I believe: individual servers don't need a lot of local storage, and migration of servers or services gets easier.

But now you need to have a backup solution for all or most of the stored data. How do you do it? Do you run backup software (I like rsnapshot) on each and every server that has data to back up? Where do you put that data? On a dedicated backup server with lots of local storage? Or back in the SAN? What is the "common" solution, if any, for backing up a SAN?

I am looking for best practices and advice from people who have more experience than me running SANs.

Thanks!

Edit: considering that the budget for the SAN is very limited to begin with, I am looking for non-proprietary, very general and cheap solutions to the backup question, if such solutions exist. There will not be any money for tapes or for a second, identical SAN array. I should have made that more explicit, sorry.


We use a NetApp 3020 SAN cluster with iSCSI, FC, and CIFS data stored on it. This product supports NDMP dumps to a locally attached SCSI tape autoloader. By using this, I get perfect copies of my iSCSI and FC LUNs as well as file-by-file backups of the CIFS data being shared from the NetApp. I use BackupExec to control the NDMP backups, and the speeds are exceptional because it is a local SCSI connection to the NetApp.


It sounds like we're in a similar boat, in terms of infrastructure size and complexity.

I've got a SAN that handles my production data, plus a backup server with decently sized locally attached storage that is in turn attached to a tape library (LTO-3, which is 400 GB uncompressed per tape).

Essentially, I do data-level backups. Since I'm running Linux, I rsync the data from the SAN-attached machines to the backup machine, then write the data to tape. I'm fortunate to have enough local storage on the backup server to keep a copy there and just rsync the differences; if you can't set that up, many backup solutions use the idea of a spooling directory to stage the data locally while it's being written to tape.
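For what it's worth, the pull step can be as simple as an rsync invocation like the sketch below (the hostname and paths are made up for illustration):

    # Pull the data from the SAN-attached host onto local disk on the
    # backup server. -a preserves permissions/ownership/times, -H keeps
    # hard links, --delete keeps the local copy an exact mirror.
    rsync -aH --delete --numeric-ids \
        root@san-client:/srv/data/ \
        /backup/spool/san-client/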

Because of the way tape drives write, it's a very bad idea to stream directly from the network to tape, e.g. from a Windows file share or NFS share. If the data arrives slower than the drive can write, the drive has to stop and reposition constantly, which completely kills the tape write speed AND the lifetime of your tape drive. So use a local disk to spool the data onto.
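The spool-to-tape step can then be one sequential write. Here's a rough sketch using plain tar and mt; the tape device and spool path are assumptions, so adjust them for your hardware:

    # Write the local spool to tape in one sequential stream so the drive
    # never has to wait on the network. /dev/nst0 is the usual
    # non-rewinding SCSI tape device on Linux.
    mt -f /dev/nst0 rewind
    tar -cf /dev/nst0 -C /backup/spool .
    mt -f /dev/nst0 offline    # rewind and eject when finished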

The backup solution I use is called Amanda. It's pretty esoteric to configure, but there is a commercial version (around $100 per server being backed up) with a web-based configuration, and you can also get extensions that plug directly into various databases.

EDIT

Since you mentioned not having tapes, I would recommend a poor man's virtual tape library (VTL), i.e. external USB drives. Amanda, at least, can address a set of files on disk as if they were a tape library, and I'm sure other software packages can as well.
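As an example, Amanda's disk-based virtual tapes can live on a USB drive. The sketch below assumes Amanda 3.x with the chg-disk changer; the config name, mount point and number of tapes are all made up, and the matching amanda.conf line would be something like tpchanger "chg-disk:/mnt/usbdisk/vtapes".

    # Create slot directories on the USB disk that Amanda will treat as
    # tapes, then label each one so Amanda accepts it. Config name,
    # mount point and slot count are illustrative.
    CONFIG=DailySet
    VTAPES=/mnt/usbdisk/vtapes
    for i in $(seq 1 10); do
        mkdir -p "$VTAPES/slot$i"
        su amanda -c "amlabel $CONFIG $CONFIG-$i slot $i"
    done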

Really, though, hard drives have a defined lifetime. If your company is spending enough money to buy a SAN, you should work on them to get a tape changer. They're actually not as expensive as they used to be, particularly if you don't buy on the bleeding edge.


About the cheapest (and weakest) form of backup you could do is to keep snapshots around, combined with some form of occasional longer-term backup.

This assumes that snapshots are cheap, which depends on how they are implemented. Copy-on-write file systems like NetApp's WAFL and Sun's ZFS have snapshots that are virtually zero-cost, in contrast to the O(n) cost of copy-out snapshots. Cheap snapshots are really, really nice.
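To make that concrete, here's a rough sketch of the day-to-day side on ZFS (the pool/dataset names and remote host are made up); a zfs send to another box or an external disk can serve as the occasional longer-term copy:

    # Take a dated, near-free snapshot of the dataset behind the home
    # directories (dataset name is illustrative).
    TODAY=$(date +%Y-%m-%d)
    zfs snapshot tank/home@daily-$TODAY

    # Occasionally ship a snapshot elsewhere as the longer-term copy,
    # e.g. to a second machine over ssh.
    zfs send tank/home@daily-$TODAY | ssh backuphost zfs receive backup/home

    # Old snapshots are removed with: zfs destroy tank/home@daily-<date>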

Just keeping snapshots around isn't really a backup solution, but I'm not sure any real solution is possible under your constraints without serious hacking.

Also, I'm seriously biased here as a NetApp dev, but you should at least talk to some NetApp salespeople before you conclude they're out of your price range. :-)


A direct- or fibre-attached tape library plus NDMP can be a pretty slick solution. But if your storage system can't write out to tape that way, or if the budget is particularly constrained, you may be in the position of having to use a traditional backup solution to back up the data in the LUN through a backup client on the host attached to the SAN.

In a scenario like this, the SAN-hosted data is treated just like a physical disk in the client being backed up.
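One wrinkle with backing up through the host is getting a consistent point-in-time view of a busy LUN. If the host puts the LUN under LVM, a snapshot taken there gives the backup client something stable to read. A rough sketch, where the volume group, LV names and snapshot size are made up:

    # Snapshot the LUN-backed logical volume so the backup sees a frozen
    # point in time, back up from the snapshot, then drop it.
    lvcreate --snapshot --size 5G --name data_snap /dev/vg_san/data
    mkdir -p /mnt/data_snap
    mount -o ro /dev/vg_san/data_snap /mnt/data_snap

    # Use whatever backup client you already have against the snapshot
    # (rsync here, purely as an example).
    rsync -aH /mnt/data_snap/ backupserver:/backup/data/

    umount /mnt/data_snap
    lvremove -f /dev/vg_san/data_snap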

While NDMP functionality is sometimes included with a storage system (a la NetApp), backup applications may actually charge extra to back up via NDMP. For example, in our NetBackup environment, the NDMP licenses were far more expensive than the regular OS-client backup licenses.

Oops, just refreshed and saw your addition re: not having the money for tapes. Where are you planning on putting your backups if not on tape or on another SAN?

Going all-disk for backup is doable, but it's usually not considered a budget option for any large amount of data. Likewise, backing the data up to the same SAN can mitigate some risks if you're careful (e.g. making sure it goes to completely separate disks), but it doesn't offer any kind of total-failure or disaster protection. The same goes for a backup server with lots of disks: some level of protection, but if the location where both the SAN and the big honkin' backup server live suffers a serious outage or disaster, all that data is gone.