Set up a low-cost image storage server with a 24x SSD array to get high IOPS?

I would consider a hybrid solution, which could be achieved with OpenSolaris, Solaris 11 Express, OpenIndiana, or Nexenta. A hybrid pool would be a lot less costly, and with a few thousand bucks' worth of RAM you will have your 150k+ IOPS with mostly ordinary spinning disks. At Nexenta we have many, many customers who do exactly this. ZFS is a robust filesystem, and with enough RAM and/or SSDs for additional read/write caching you can have a very solid solution at a relatively low cost. With Nexenta Core, the community edition, you get 18TB at no cost at all. A recent release of OpenIndiana gives you much of the same functionality. Add snapshots, cloning, and replication using ZFS send/recv, and you can build a SAN that will give any EMC box a run for its money at a far lower cost. Lots of SSDs are nice, but there are other options, some not half bad.
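To make the "RAM plus spindles" claim a bit more concrete, here is a back-of-the-envelope Python sketch that blends cache and disk service times. Every number in it (cache IOPS, per-spindle IOPS, spindle count, hit rates) is an illustrative assumption, not a measurement; the point is only that the cache hit rate dominates the result.

```python
# Back-of-the-envelope model of a ZFS hybrid pool's read IOPS.
# Every figure below is an illustrative assumption, not a measurement.

def effective_read_iops(hit_rate, cache_iops, disk_iops):
    """Blended IOPS when a fraction `hit_rate` of reads is served from
    RAM/SSD cache (ARC/L2ARC) and the rest from spinning disks."""
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    # Average service time is the weighted mean of cache and disk service times.
    avg_service_time = hit_rate / cache_iops + (1.0 - hit_rate) / disk_iops
    return 1.0 / avg_service_time

cache_iops = 150_000        # assumed: reads served from RAM/SSD cache
disk_iops = 200 * 20        # assumed: 20 spindles at ~200 IOPS each

for hit_rate in (0.90, 0.99, 0.999):
    print(f"{hit_rate:.1%} cache hit rate -> "
          f"~{effective_read_iops(hit_rate, cache_iops, disk_iops):,.0f} IOPS")
```

Under these assumptions you only approach the 150k figure when the working set almost entirely fits in ARC/L2ARC, which is exactly why the RAM spend matters.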


Use RAID 6 over RAID 10. For mainly read-based I/O loads the throughput should be similar when the array is not degraded, you get better redundancy, and you get a larger usable array unless you arrange the drives in 4-drive sub-arrays (see below). On the redundancy point: any two drives can fail at the same time with RAID 6, whereas RAID 10 cannot survive if both failed drives are in the same mirror pair, so a 4-drive RAID 10 survives only four of the six possible two-drive failure combinations (I'm not sure off the top of my head how that 4/6 figure scales for larger arrays).
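For what it's worth, the 4/6 figure can be checked by brute force, and the same check shows how it scales: with n mirror pairs, only the n "both halves of one mirror" combinations are fatal. A minimal Python sketch, assuming mirror pairs are adjacent drives:

```python
# Quick check of the "4 out of 6 two-drive failure combinations" figure
# for RAID 10, and how it scales with array size. RAID 6 survives any
# two simultaneous failures by construction.
from itertools import combinations

def raid10_two_drive_survival(total_drives):
    """Fraction of two-drive failure combinations a RAID 10 array survives.
    Drives 2k and 2k+1 are assumed to form mirror pair k."""
    pairs = [(d, d + 1) for d in range(0, total_drives, 2)]
    combos = list(combinations(range(total_drives), 2))
    fatal = [c for c in combos if c in pairs]   # both halves of one mirror
    return (len(combos) - len(fatal)) / len(combos)

for n in (4, 8, 24):
    frac = raid10_two_drive_survival(n)
    print(f"{n:2d} drives: survives {frac:.1%} of two-drive failures")
# 4 drives -> 66.7% (the 4/6 figure); 24 drives -> ~95.7% (22 out of 23).
```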

Your space calculation is off, certainly for RAID 10. 24*240GB is 5,760GB with no redundancy (RAID 0 or JBOD). With RAID 10 you'll get only 2,880GB, as there are (usually) two exact copies of every block. If you use all the drives as one large RAID 6 array you will get your 5TB (5,280GB, with two drives' worth of parity information spread over the array), but I personally would be more paranoid and create smaller RAID 6 arrays and join them with RAID 0 or JBOD: that way you have shorter rebuild times when drives are replaced, and in many cases you can survive more drives failing at once (two drives per leg can die, rather than two drives out of the total 24, before the array becomes useless). With four drives per leg you get the same amount of space as RAID 10. Four 6-drive legs may be a good compromise (4*4*240 = 3,840GB usable) or three 8-drive legs (3*6*240 = 4,320GB usable).
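If it helps, here is the same arithmetic as a small Python sketch covering the layouts mentioned above. Raw capacities only (formatted capacity will be a bit lower); the drive count, drive size, and leg sizes are the ones from this answer.

```python
# Usable capacity for the layouts discussed above, assuming 24 x 240GB drives.

DRIVES, SIZE_GB = 24, 240

def raid10_gb(drives, size):
    return drives // 2 * size                  # half the drives hold copies

def raid6_gb(drives, size):
    return (drives - 2) * size                 # two drives' worth of parity

def raid60_gb(drives, size, leg_size):
    legs = drives // leg_size
    return legs * raid6_gb(leg_size, size)     # RAID 6 legs striped together

print("RAID 10:               ", raid10_gb(DRIVES, SIZE_GB), "GB")
print("Single RAID 6:         ", raid6_gb(DRIVES, SIZE_GB), "GB")
print("RAID 6+0, 4-drive legs:", raid60_gb(DRIVES, SIZE_GB, 4), "GB")
print("RAID 6+0, 6-drive legs:", raid60_gb(DRIVES, SIZE_GB, 6), "GB")
print("RAID 6+0, 8-drive legs:", raid60_gb(DRIVES, SIZE_GB, 8), "GB")
```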

With regard to controllers: these can be a single point of failure for RAID. If a controller dies, you lose all the drives attached to it at once. While such failures are pretty rare (random corruption is more common), there is no harm in taking care to reduce the impact should it happen to you. If you use RAID 10, make sure that no mirror pair has both drives on the same controller (which means having at least two controllers). If you split the drives into 4-drive RAID 6 legs, use four controllers and put one drive of each leg on each controller. This of course assumes you are using software RAID and simple controllers, which might be unlikely (if you are spending this much on drives, you may as well get some decent hardware RAID controllers to go with them!).
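As a rough illustration of that placement rule, here is a hedged Python sketch that puts member i of every leg on controller i, so losing one controller costs each RAID 6 leg at most a single drive. The controller and leg names are made up for the example.

```python
# Sketch of a drive-to-controller layout where no RAID leg has two drives
# on the same controller, so a single controller failure costs each leg
# at most one drive.

def layout(num_drives, num_controllers, leg_size):
    """Assign drives to (controller, leg) so member i of each leg sits on
    controller i. Requires num_controllers >= leg_size."""
    assert num_controllers >= leg_size, "need one controller per leg member"
    legs = [list(range(start, start + leg_size))
            for start in range(0, num_drives, leg_size)]
    assignment = {}
    for leg_no, leg in enumerate(legs):
        for member, drive in enumerate(leg):
            assignment[drive] = (f"controller{member}", f"leg{leg_no}")
    return assignment

# 24 drives, 4-drive RAID 6 legs, four controllers:
assignment = layout(24, 4, 4)
for ctrl in sorted({c for c, _ in assignment.values()}):
    drives = sorted(d for d, (c, _) in assignment.items() if c == ctrl)
    print(ctrl, "->", drives)   # each controller holds one member of every leg
```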

You should give some thought to a backup solution too, if you have not already. RAID will protect you from certain hardware failures, but not from many human errors and other potential problems.


Just buy two Fusion-io ioDrive Octal cards and mirror them: far simpler, far faster (though it might be a bit more expensive).


Answering your key questions:

  1. RAID 6 vs. RAID 10: You almost certainly do not need to worry about IOPS if you are using SSDs as primary storage.

  2. SLC vs. MLC: the differences are subtler than they first appear. If you are going to use MLC, I would suggest buying Intel. The Intel 320 series has a SMART counter that you can use to track the wear-level percentage and replace the drive before it fails.
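As an aside, that wear counter can be polled and alerted on. Here is a hedged sketch using smartctl; the attribute name/ID (to my knowledge, 233 "Media_Wearout_Indicator" on Intel drives) and the device path are assumptions, so check them against your exact drive.

```python
# Hedged sketch: poll a drive's SMART wear attribute via smartctl and warn
# before it runs out. On the Intel 320 series the relevant attribute is,
# to my knowledge, 233 "Media_Wearout_Indicator"; verify the name/ID for
# your exact drive. The device path is an example, and smartctl usually
# needs root.
import subprocess

def wearout_value(device="/dev/sda", attr_name="Media_Wearout_Indicator"):
    """Return the normalized value of the wear attribute (100 = new,
    counting down as the flash wears), or None if it is not reported."""
    out = subprocess.run(["smartctl", "-A", device],
                         capture_output=True, text=True).stdout
    for line in out.splitlines():
        fields = line.split()
        if len(fields) >= 4 and fields[1] == attr_name:
            return int(fields[3])       # column 4 is the normalized VALUE
    return None

value = wearout_value()
if value is not None and value <= 10:
    print(f"WARNING: wear indicator at {value}, plan a replacement")
```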

However, if you want to use SSDs to improve storage performance in a reliable way, you may want to look at ZFS on the Nexenta OS (or possibly FreeBSD; I'm not sure of its ZFS development status):

  1. ZFS allows you to build a "RAID-Z2" array (somewhat like RAID 6) of conventional disks that uses SSDs as massive read (L2ARC) and write-log (ZIL) caches, allowing you to get the performance benefits that you're looking for without the cost of an all-flash array.

  2. Blocks that are accessed often will be read from the SSDs, and blocks that are used less often will still be read from disk. Writes (synchronous writes in particular) hit the SSD log first and are committed to the main disks when it is convenient for the array. (See the sketch after this list.)

  3. Because you will need fewer SSDs, you can afford higher-quality devices, and you will not see the kind of catastrophic failure that is to be expected if you build a RAID array out of consumer-grade MLC devices from OCZ (or whatever).

  4. Even if you don't use high-quality devices, the consequences are less severe. If you use MLC devices for your ZFS L2ARC and they fail, you still have your data preserved on disk.
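To illustrate point 2, here is a toy Python model of the read path: hot blocks end up served from a small fast cache while cold blocks still come from disk. It uses plain LRU, which is not ZFS's actual ARC/L2ARC algorithm (that is considerably smarter), just an approximation of the behaviour described above; all the sizes and access patterns are made-up assumptions.

```python
# Toy illustration: frequently accessed blocks get served from a small fast
# cache, rarely accessed blocks still come from disk. Plain LRU, not ZFS's
# real ARC algorithm.
from collections import OrderedDict
import random

class TieredRead:
    def __init__(self, cache_blocks):
        self.cache = OrderedDict()          # block -> None, kept in LRU order
        self.cache_blocks = cache_blocks
        self.ssd_hits = self.disk_reads = 0

    def read(self, block):
        if block in self.cache:
            self.cache.move_to_end(block)   # refresh recency
            self.ssd_hits += 1
        else:
            self.disk_reads += 1            # cold read comes from spinning disk
            self.cache[block] = None
            if len(self.cache) > self.cache_blocks:
                self.cache.popitem(last=False)

# 10,000 reads over 100,000 blocks, with 90% of reads hitting a hot 5% subset.
random.seed(1)
tier = TieredRead(cache_blocks=10_000)
for _ in range(10_000):
    if random.random() < 0.9:
        tier.read(random.randrange(5_000))      # hot working set
    else:
        tier.read(random.randrange(100_000))    # long cold tail
print(f"served from SSD cache: {tier.ssd_hits}, from disk: {tier.disk_reads}")
```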