How would you back up 15TB of data to multiple drives?

I have about 15TB of data across several RAID6 arrays, and it's been growing at a rate of about 10%. I want a cheap way to back up this data at regular intervals and take it offsite, so I purchased five 3TB drives, hoping to just back up to them using an eSATA drive dock. Ideally, I wanted to load up 3TB at a time and keep popping in drives until I was done.

However, the filesystems I'm backing up have deep and complex folder structures, which rules out a folder-by-folder backup, especially as I plan on doing this every few weeks. Incremental backups would be ideal, so that I could just keep adding drives as needed. However, I wasn't able to find an affordable solution for this (Roxio Retrospect is too pricey).

Does anybody have any suggestions?

Thanks


Solution 1:

Assuming you're using Linux, you could probably pool the drives into a single volume with Greyhole, then load it up and let it take care of the complexity. However, this wouldn't work unless you could connect the whole set of drives at once.

There are also applications that split folders of files into known sizes; dirsplit comes to mind. I have no idea how they'd work with multiple swappable drives, though.
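To illustrate the idea of splitting files into known sizes, here is a minimal Python sketch that assigns files to 3TB drives with a first-fit-decreasing heuristic. It is not how dirsplit works internally; the names `DRIVE_BYTES`, `USABLE_FRACTION`, `scan` and `pack_first_fit` are made up for this example, and the 5% headroom is just an assumption about filesystem overhead.

```python
import os
from typing import List, Tuple

DRIVE_BYTES = 3 * 10**12   # 3TB as marketed (decimal bytes); an assumption
USABLE_FRACTION = 0.95     # assumed headroom for filesystem overhead


def scan(root: str) -> List[Tuple[str, int]]:
    """Return (path, size in bytes) for every regular file under root."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks and special files; only real file data is counted.
            if os.path.isfile(path) and not os.path.islink(path):
                found.append((path, os.path.getsize(path)))
    return found


def pack_first_fit(files: List[Tuple[str, int]], capacity: int) -> List[List[str]]:
    """Assign files to drives, largest first, using the first drive with room."""
    manifests: List[List[str]] = []   # one list of paths per drive
    free: List[int] = []              # remaining bytes on each drive
    for path, size in sorted(files, key=lambda item: item[1], reverse=True):
        if size > capacity:
            raise ValueError(f"{path} ({size} bytes) does not fit on one drive")
        for index, room in enumerate(free):
            if size <= room:
                manifests[index].append(path)
                free[index] -= size
                break
        else:
            # No existing drive has room, so start a new one.
            manifests.append([path])
            free.append(capacity - size)
    return manifests


# Tiny in-memory demo with made-up file names and sizes.
demo = [("a.iso", 5), ("b.tar", 4), ("c.img", 3), ("d.log", 2)]
for number, manifest in enumerate(pack_first_fit(demo, capacity=7), start=1):
    print(f"drive {number}: {manifest}")
```

For real data you would call something like `pack_first_fit(scan("/data"), int(DRIVE_BYTES * USABLE_FRACTION))` (the path is a placeholder) and write each manifest to a text file, which a copy tool that accepts a file list could then consume one drive at a time. Note that this scatters the contents of a folder across drives, so a restore may need several of them.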

Solution 2:

Don't try to roll your own backups - there are great software packages out there that have already solved this problem and solved it well. I'm a big fan of Amanda and Bacula.

You should be looking into a proper backup solution; if you can buy a tape drive or library, do so if you actually care about the data you're backing up.

If you get a single tape drive, go with Amanda. If you get a tape library, go with Bacula.

If you insist on sticking with your eSATA hard drive backup process, I'd suggest using Amanda and using the hard drives as 'tapes' from Amanda's point of view.

Solution 3:

With that much data, you should really look at Nexenta. The free community edition will handle up to 18 TB, but they offer commercial editions that handle much, much more. The nice thing about Nexenta is that you do not need RAID controllers, which makes it cheaper, and you can use any collection of drives, i.e. they don't need to be a matched set.

But the main benefit of Nexenta is the deduplication and compression in the ZFS filesystem. Depending on what you are storing, it may take quite a bit less raw drive space, which also saves you money. Then you either take the whole system offsite, or you build it with external SATA cables in the first place and just unplug the drives. ZFS knows which drive is which, so you do not need to keep track of which cable each one was plugged into. No need to buy a case for all the drives; just make up a nice plastic storage box with some conductive foam dividers for moving the drives.
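To get a rough feel for how much deduplication might save on your own data, here is a Python sketch that hashes files in fixed-size blocks and counts how many blocks are unique. It is only an approximation: it assumes 128 KiB blocks (the ZFS default record size) and SHA-256, ignores compression and metadata, and the function names are invented for this example rather than taken from any ZFS tool.

```python
import hashlib
import os
from typing import Iterator, Tuple

BLOCK_SIZE = 128 * 1024  # assumed block size, matching the ZFS default recordsize


def block_digests(path: str, block_size: int = BLOCK_SIZE) -> Iterator[bytes]:
    """Yield a SHA-256 digest for each fixed-size block of one file."""
    with open(path, "rb") as handle:
        while True:
            block = handle.read(block_size)
            if not block:
                break
            yield hashlib.sha256(block).digest()


def estimate_dedup(root: str, block_size: int = BLOCK_SIZE) -> Tuple[int, int]:
    """Return (total_blocks, unique_blocks) for all regular files under root."""
    seen = set()
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks and special files so data is not counted twice.
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            for digest in block_digests(path, block_size):
                total += 1
                seen.add(digest)
    return total, len(seen)
```

Calling `estimate_dedup` on a representative sample directory and comparing the two numbers gives a ballpark ratio. Running it over the full 15TB would keep well over a hundred million digests in memory, so a sample is the practical choice.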

And if you are not sure how this will all work, get a couple of USB hubs and a dozen thumb drives, and set up Nexenta with those. Then you can see in practice how the ZFS RAID works, and how it recognizes the drives by content, not by connection.