24TB RAID 6 configuration
I am in charge of a new website in a niche industry that stores lots of data (10+ TB per client, growing to 2 or 3 clients soon). We are considering ordering about $5000 worth of 3TB drives (10 in a RAID 6 configuration and 10 for backup), which will give us approximately 24 TB of production storage. The data will be written once and remain unmodified for the lifetime of the website, so we only need to do a backup one time.
I understand basic RAID theory, but I have no hands-on experience with it. My question is: does this sound like a good configuration? What potential problems could this setup cause?
Also, what is the best way to do a one-time backup? Have two RAID 6 arrays, one for offsite backup and one for production? Or should I back up the RAID 6 production array to a JBOD?
EDIT: The data server is running Windows 2008 Server x64.
EDIT 2: To reduce rebuild time, what would you think about using two RAID 5 arrays instead of one RAID 6?
I currently support 220 servers of up to 96 TB each (totalling 2 PB or so), some in clusters of up to 240 TB, that my team built. Here is my advice:
- use a good, reliable hardware RAID controller: possible choices are 3Ware 96xx or 97xx, LSI 92xx, Areca 16xx, Adaptec 5xx5... With a Battery Backup Unit, of course, because power failures do occur.
- use only professional-grade drives rated for 24/7 operation; don't use cheap desktop drives. You don't want to lose $100,000 worth of data because you chose to save 20 bucks per drive.
- The bigger the drives, the longer the rebuild. A 3 TB drive will need at least 12 hours in the best case. Use RAID-6 for reliable protection.
- drives do fail, up to 5% of them per year. Don't even dream of using JBOD, even for backup; that is plain bad advice. Use RAID-6.
- RAID-5 is obsolete; we simply don't use it anymore with drives bigger than 300 GB. See this expert post for instance. Did I mention you should use RAID-6?
- For only 24 TB, I'd stick to 2 TB drives: there is a 10-15% premium on 3 TB drives, and more spindles will provide better performance, shorter rebuilds, and better safety, because the 2 TB drives have been available for quite a long time and are really very reliable.
- You could buy an excellent 3U Supermicro, AIC or equivalent chassis with 16 drive slots and redundant power supplies; filled with 2 TB drives (RAID-6 + hot spare), it would provide roughly 24 TiB of available space.
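As a rough check of the arithmetic in the list above, here is a minimal Python sketch. The function names are made up for illustration, and the failure estimate treats the quoted 5% per year as an independent per-drive rate, which is a simplifying assumption rather than a measured figure.

```python
def raid6_usable_tb(total_slots, drive_tb, hot_spares=1):
    """Usable capacity in TB: hot spares hold no data, RAID-6 spends two drives on parity."""
    array_drives = total_slots - hot_spares
    return (array_drives - 2) * drive_tb


def tb_to_tib(tb):
    """Convert decimal terabytes (10**12 bytes) to binary tebibytes (2**40 bytes)."""
    return tb * 10**12 / 2**40


def p_any_failure(drives, annual_failure_rate):
    """Chance that at least one drive fails within a year, assuming independent failures."""
    return 1 - (1 - annual_failure_rate) ** drives


usable = raid6_usable_tb(total_slots=16, drive_tb=2)
print(usable, round(tb_to_tib(usable), 1))   # 26 23.6
print(round(p_any_failure(10, 0.05), 2))     # 0.4
```

So a 16-slot chassis of 2 TB drives lands a little under 24 TiB, and at the pessimistic 5% rate a 10-drive array has roughly a 40% chance of seeing at least one failure in a year, which is why the redundancy matters.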
Honestly, I think $5k for the drives is a bit steep, but that's a whole other subject. The setup sounds reasonable enough, but after a drive failure a single 24 TB volume will take FOREVER to rebuild (ever tried to read 3 TB of data split across 9 other disks?). It would be better to build smaller RAID sets and join them together to form a bigger volume. Then a failed drive doesn't kill the performance of the entire volume while it rebuilds, only the performance of the one RAID set.
Also, the platform your website runs on (Linux/Windows/OSX/Solaris/???) can dictate which tools and which configuration you use.
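To put numbers on the rebuild argument, here is a small Python sketch comparing one large RAID set with two smaller ones. The 70 MB/s sustained rebuild rate is an assumed figure chosen only for illustration; real rates depend on the controller, the drives, and how busy the array is. The function names are hypothetical.

```python
def rebuild_hours(drive_tb, rebuild_mb_per_s):
    """Best-case hours to rewrite one replacement drive at a sustained rate."""
    seconds = drive_tb * 10**12 / (rebuild_mb_per_s * 10**6)
    return seconds / 3600


def layout(sets, drives_per_set, parity_per_set, drive_tb):
    """Summarise a volume built from identical RAID sets joined together."""
    return {
        "usable_tb": sets * (drives_per_set - parity_per_set) * drive_tb,
        # Data read from the surviving drives to rebuild one failed drive.
        "read_during_rebuild_tb": (drives_per_set - 1) * drive_tb,
        # Share of the joined volume that runs degraded during that rebuild.
        "degraded_share": 1 / sets,
    }


print(round(rebuild_hours(3, 70), 1))   # 11.9 (assumed 70 MB/s)
print(layout(sets=1, drives_per_set=10, parity_per_set=2, drive_tb=3))
print(layout(sets=2, drives_per_set=5, parity_per_set=1, drive_tb=3))
```

Both layouts give 24 TB usable from ten 3 TB drives. The single RAID 6 set reads 27 TB from the nine survivors and degrades the whole volume during a rebuild; each 5-drive RAID 5 set reads 12 TB and degrades half of it, but it survives only one failure per set.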
What do you mean by a "one-time backup"? If you mean a "one-way archive" (i.e. new files are written to the backup server, but nothing is ever read from it), I highly recommend using rsync in *nix-flavored environments (Linux, Unix, etc.) or, if it's Windows (IIS) based, something like SyncToy or XXCOPY. If you need a LIVE copy (zero delay between when a file is written and when it appears on the other server), you'll need to provide more information about your environment. Linux and Windows work completely differently, and the tools are 100% different. For that kind of setup, you'll probably want to look into clustered file systems, and you should probably look towards a SAN rather than host-based storage.
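To make the "one-way archive" idea concrete, here is a minimal Python sketch that copies only files missing from the backup and never overwrites or deletes anything there. It is a stand-in for illustration, not a replacement for rsync (whose `--ignore-existing` option gives similar behaviour); the function name and the directory names in the demo are hypothetical.

```python
import shutil
import tempfile
from pathlib import Path


def archive_new_files(source, dest):
    """Copy files that are missing from dest; never overwrite or delete."""
    source, dest = Path(source), Path(dest)
    copied = []
    for src_file in sorted(source.rglob("*")):
        if not src_file.is_file():
            continue
        target = dest / src_file.relative_to(source)
        if target.exists():
            continue  # write-once data: anything already archived is left alone
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src_file, target)  # copy2 also preserves timestamps
        copied.append(target)
    return copied


# Small self-contained demo in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "production"
    dst = Path(tmp) / "backup"
    (src / "client_a").mkdir(parents=True)
    (src / "client_a" / "data.bin").write_bytes(b"payload")
    first_run = archive_new_files(src, dst)
    second_run = archive_new_files(src, dst)
    print(len(first_run), len(second_run))  # 1 0
```

Because the data is written once and never modified, checking only whether the target exists is enough here; data that changes would need a comparison of size, timestamps, or checksums.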
We generally use RAID 5 or 6 for backup disks, as it gives the best bang for the buck once you ignore RAID 0 :-), so I'd go for that rather than JBOD.
One thing you might consider is buying your disks in separate batches rather than all 20 at once: if a batch has a manufacturing defect, its drives may fail at similar times.
You may also wish to consider mirroring rather than conventional backups, since the data is only written once. Quite a few software and hardware storage systems allow that to be set up, and you may also get the benefit of failover if your primary storage fails.
One option that would fit well with your use-case, especially if your requirements keep growing, is an HSM (Hierarchical Storage Manager). I've installed several HSMs ranging up to 150TB of disk and 4PB of tape.
The idea is that an HSM manages the lifecycle of data to reduce the overall cost of storage. Data is initially stored on disk but almost immediately archived to tape (which is much cheaper per byte). Archive policies can be configured to store multiple copies on tape for extra safety, and most people take a second copy offsite. The migration to and from tape is transparent to the end user - the files still appear in the filesystem.
When the end user requests the file in future, the data is automatically staged back from tape and served to the user. With a tape library, the staging process only adds about a minute to the retrieval time.
One huge benefit of an HSM is the recovery time if your disks fail or if you have filesystem corruption. If you ever have a catastrophic disk or filesystem failure, you can just find some more disk and restore a recent backup of the filesystem metadata (a tiny fraction of the total data volume). At that point, all of the data is available on-demand as per usual.
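The lifecycle described above can be illustrated with a toy in-memory model in Python. This is only a sketch of the concept, not how any real HSM product is implemented; the class and method names are invented, and dictionaries stand in for disk, tape, and filesystem metadata.

```python
class ToyHSM:
    """Toy model: files stay listed in metadata while their data moves between tiers."""

    def __init__(self):
        self.disk = {}         # name -> data currently held on disk
        self.tape = {}         # name -> data archived on tape
        self.metadata = set()  # names visible in the filesystem

    def write(self, name, data):
        self.disk[name] = data
        self.metadata.add(name)
        self.tape[name] = data  # archived to tape almost immediately

    def release(self, name):
        """Free the disk copy; the file stays listed and remains on tape."""
        if name in self.tape:
            self.disk.pop(name, None)

    def read(self, name):
        if name not in self.metadata:
            raise FileNotFoundError(name)
        if name not in self.disk:
            self.disk[name] = self.tape[name]  # transparent stage from tape
        return self.disk[name]

    def fail_disks(self):
        """Catastrophic loss: disk contents and filesystem metadata are gone."""
        self.disk.clear()
        self.metadata.clear()

    def restore_metadata(self, backup):
        """Restore only the small metadata backup; data is staged on demand."""
        self.metadata = set(backup)


hsm = ToyHSM()
hsm.write("client_a/scan.dat", b"payload")
metadata_backup = set(hsm.metadata)
hsm.fail_disks()
hsm.restore_metadata(metadata_backup)
print(hsm.read("client_a/scan.dat"))  # b'payload'
```

After the simulated failure nothing is copied back in bulk: restoring the metadata makes every file visible again, and each file's data returns from tape the first time it is read.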