Hardware or Software RAID and Filesystem Type?

I'm looking to build a pretty simple RAID-5 or RAID-6 array and I'm looking into my options. The machine will be a 64-bit Ubuntu Linux server and will essentially be used as a backup/file server. I'll need to store pretty large files, so a FAT32 filesystem won't exactly do the trick :)

I've used ext3 and ext4 in the past, but this has largely been for desktop environments. I'm very intrigued by the btrfs project, and would like to use it if possible. zfs looks sweet, but I read a while back that it doesn't play nicely with Linux. What filesystem should I use?

All of this is above the level of hardware/software RAID configuration. What should I use for RAID? btrfs only supports RAID-0, RAID-1, and RAID-10 configurations and I'm specifically interested in RAID-5 and RAID-6. If I use hardware RAID, does the array appear to the operating system as a single hard disk? If so, then this would essentially mitigate the RAID support issues with btrfs. However, is there a way for me to know if and when a drive fails from the operating system level? How does one restore an array in the event of a failure?

EDIT

As requested, the size I'm looking at is at least 6TB. With RAID-5, I can accomplish this with 4 2TB drives. With RAID-6, I believe I'm looking at 5 2TB drives for the same amount of storage. I don't think I'll need to be able to expand the array, but if I did, it would be at least in a few years' time.
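
The drive counts above follow from the usual rule of thumb that RAID-5 gives up one drive's worth of space to parity and RAID-6 gives up two. A minimal sketch of that arithmetic, assuming equal-sized drives and ignoring filesystem overhead (the function name `usable_tb` is made up for illustration):

```python
def usable_tb(level: int, drives: int, drive_tb: float) -> float:
    """Usable capacity of a RAID-5 or RAID-6 array of equal-sized drives."""
    parity = {5: 1, 6: 2}[level]  # drives' worth of space used for parity
    minimum = parity + 2          # RAID-5 needs at least 3 drives, RAID-6 at least 4
    if drives < minimum:
        raise ValueError(f"RAID-{level} needs at least {minimum} drives")
    return (drives - parity) * drive_tb


print(usable_tb(5, 4, 2))  # four 2TB drives in RAID-5
print(usable_tb(6, 5, 2))  # five 2TB drives in RAID-6
```

Both calls come out at 6TB, which matches the numbers in the question.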


Solution 1:

Filesystems:

Since you say you want it to be pretty simple, I'd suggest staying away from filesystems that are relatively new and still gaining features, like btrfs. I'd recommend one of the more mature and stable filesystems, such as ext4. Some that you may wish to check out:

Ext4 - Probably the most popular Linux filesystem. Its file size limits are far more than you'd likely need, and it has a whole bunch of performance improvements over Ext2 and Ext3. It can be grown while mounted, but shrinking it requires taking it offline.
XFS - Has been around for a long time, and its Wikipedia page notes right at the top that it is particularly proficient at handling large files. It also supports online growing in case you add another hard drive later.
ReiserFS and possibly its newer revision Reiser4 - Also been around for a long time, and offers many of the same features. However, future development may be somewhat up in the air after the owner of the company that develops it was convicted of murder(!). At the time of writing it is still actively developed and supported by Namesys, though, and most likely the company will be sold to someone else soon.

Also check out the Wikipedia page comparing filesystems.

Hardware vs Software:

If you have the money, I'd say go with hardware RAID every time. By hardware, I mean a dedicated RAID card that supports RAID-5/6. RAID that comes "built in" to the motherboard is not real hardware RAID (except on some special server motherboards). If the budget is tight, though, and the CPU on the server hosting the array isn't doing anything else anyway, you may be able to get almost equivalent performance from a software RAID implementation. Hardware RAID is also more likely to support hot-plugging well than a standard PC motherboard (although in both cases you'd usually also want a SATA drive backplane if you want to hot-plug devices).

Both Linux software RAID and hardware RAID present one large "drive" to the filesystem. Hardware RAID implementations are usually completely transparent to Linux, as all of their configuration is usually done in the card's BIOS, accessed as the PC starts, but most hardware RAID cards also have Linux tools you can use while the array is online.
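
On the question of noticing a failed drive from the OS: with Linux software RAID (md), the kernel reports array state in /proc/mdstat, where each array has a status string such as [UUUU], and an underscore marks a missing or failed member. Below is a rough sketch of checking for that. The sample text is illustrative rather than taken from a real machine, the function name `degraded_arrays` is made up, and hardware RAID cards need their vendor's own tools instead.

```python
import re

# Illustrative /proc/mdstat contents for a 4-drive RAID-5 with one failed member.
SAMPLE_MDSTAT = """\
Personalities : [raid6] [raid5] [raid4]
md0 : active raid5 sdd1[3] sdc1[2] sdb1[1](F) sda1[0]
      5860147200 blocks super 1.2 level 5, 512k chunk, algorithm 2 [4/3] [U_UU]

unused devices: <none>
"""


def degraded_arrays(mdstat_text: str) -> dict[str, str]:
    """Map each degraded md array name to its member status string, e.g. 'U_UU'."""
    degraded = {}
    current = None
    for line in mdstat_text.splitlines():
        header = re.match(r"^(md\w+)\s*:", line)
        if header:
            current = header.group(1)  # start of a new array's entry
            continue
        status = re.search(r"\[([U_]+)\]", line)
        if current and status and "_" in status.group(1):
            degraded[current] = status.group(1)
    return degraded


# On a real machine you would pass open("/proc/mdstat").read() instead.
print(degraded_arrays(SAMPLE_MDSTAT))
```

Running something like this from cron and mailing yourself when the result is non-empty is one option. mdadm also has its own monitor mode that can send alerts, which is probably the better choice in practice.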

Solution 2:

See this article: http://augmentedtrader.wordpress.com/2012/05/13/10-things-raid/. Here are some snippets from that post.

Software RAID has advanced significantly in the last few years (as of 2012). Hardware RAID still has the three key vulnerabilities it has always had: First, it is expensive. Second, if your RAID card fails, your RAID volume fails; it is a single point of failure. Third, if your RAID card fails, you must find an exact replacement for that card to recover your data.

On the other hand, software RAID costs nothing, and if your controller card or motherboard fails, you can just move your disks to another machine and set up the appropriate software to read them.

Yes, hardware RAID can be faster than software RAID, but that gap is closing, and the flexibility and reliability offered by software RAID outweigh that single advantage. The only case where hardware RAID is the right choice is when absolute speed is the only priority, and you’re willing to take risks with your data.