One large RAID 10 vs several smaller arrays?

Solution 1:

Knowing how to set up your storage is all about measuring and budgeting the IOPS and bandwidth. (I'm being simplistic here because the read/write mix, average IO size, RAID stripe size, and cache hit percentages all matter greatly. If you can get those numbers you can make your calculations even more accurate.)
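As a rough sketch of what that budgeting looks like, here's a back-of-the-envelope calculation in Python. The spindle count, per-disk IOPS figure, and 70/30 read/write mix below are assumptions for illustration, not measurements:

```python
# Back-of-the-envelope front-end IOPS budget for a RAID 10 set.
# Assumptions (not measurements): 8 x 10K SAS spindles at ~140 random
# IOPS each, a 70/30 read/write mix, and a RAID 10 write penalty of 2
# (every write lands on two mirrored disks).

spindles = 8
iops_per_spindle = 140          # assumed per-disk random IOPS
read_ratio, write_ratio = 0.70, 0.30
raid10_write_penalty = 2

raw_iops = spindles * iops_per_spindle

# raw back-end IOPS = front-end IOPS * (read% + write% * penalty),
# so solve for the front-end IOPS the array can actually absorb:
front_end_iops = raw_iops / (read_ratio + write_ratio * raid10_write_penalty)

print(f"Raw back-end IOPS:              {raw_iops}")
print(f"Usable front-end IOPS at 70/30: {front_end_iops:.0f}")
```

The same arithmetic works for other RAID levels by swapping in their write penalty (4 for RAID 5, 6 for RAID 6).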

There's a really nice little IO calculator here that I frequently use when planning out storage. wmarow's storage directory is also nice for getting some fairly contemporary disk performance numbers.

If I dedicate a pair of spindles to a particular task, such as transaction logs, and they're not even breaking a sweat with the workload... why not just put that workload onto a larger RAID 10?

Remember that putting sequential IO onto a spindle with random IO makes that sequential IO random. Your transaction log disks may look like they're not breaking a sweat because you're seeing sequential IO operations. Sequential reads and writes to a RAID-1 volume will be quite fast, so if you're basing "not breaking a sweat" on disk queue length, for example, you're not getting the whole story.
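To put rough numbers on that: a spindle that streams sequential IO comfortably collapses to a few MB/s once the access pattern turns random. The figures below are assumptions for a generic 7200 RPM disk, not measurements:

```python
# The same 7200 RPM spindle, sequential vs. random. Figures are rough
# assumptions, not measurements.

seq_throughput_mb_s = 150       # assumed sequential streaming rate
random_iops = 75                # assumed random IOPS for a 7200 RPM disk
io_size_kb = 64                 # assumed transfer size per random IO

random_throughput_mb_s = random_iops * io_size_kb / 1024

print(f"Sequential:        ~{seq_throughput_mb_s} MB/s")
print(f"Random at {io_size_kb} KB IOs: ~{random_throughput_mb_s:.1f} MB/s")
# Mixing a log's sequential stream with random IO pushes it toward the
# second number, not the first.
```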

Measure or calculate the maximum possible random IOPS for the intended destination volume, take a baseline of the current workload on that volume, and then decide whether you have enough headroom to fit those transaction log IOPS into the remaining random IOPS on the destination volume. Also, be sure to budget the space necessary for the workload (obviously). If you're so inclined, build in a percentage of additional "headroom" in your IO workload and space allocation.
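A minimal sketch of that headroom check, with hypothetical numbers standing in for your own measurements:

```python
# Headroom check for folding a transaction-log workload into a RAID 10
# volume. All numbers are hypothetical placeholders; measure your own.

volume_max_random_iops = 860    # measured or calculated ceiling of the volume
baseline_workload_iops = 500    # measured baseline already on the volume
txlog_iops = 120                # measured transaction-log workload
headroom_pct = 0.20             # cushion you want to keep free

budget = volume_max_random_iops * (1 - headroom_pct)
projected = baseline_workload_iops + txlog_iops

if projected <= budget:
    print(f"Fits: {projected} of {budget:.0f} budgeted IOPS used")
else:
    print(f"Over budget by {projected - budget:.0f} IOPS; add spindles "
          "or keep the logs on dedicated disks")
```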

Continue with this methodology for all of the other workloads that you want to put into the destination RAID-10 volume. If you run out of random IOPS, you're piling too much into the volume: add more disks or put some of the workloads on dedicated volumes. If you run out of space, add more disks.

Solution 2:

I've read more articles lately saying that RAID 5 is a bad way to go for reliability. I believe it even more after having a disk go bad in a RAID 5 array: we replaced the drive, but the rebuild couldn't complete because a second disk had a latent unrecoverable read error, which forced us to reformat and restore from backup.

As drives get larger, the odds of having an undetected unrecoverable read error on these huge disks also increase, and with RAID 5 the reliability simply isn't cutting it anymore. If you're not mirroring already, the advice these days is apparently to go with RAID 10.
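To see why, here's a back-of-the-envelope estimate of the chance of hitting a URE during a RAID 5 rebuild, assuming the commonly quoted consumer-drive spec of one unrecoverable error per 10^14 bits read (the drive size and array width are illustrative):

```python
import math

# Rough odds of hitting an unrecoverable read error (URE) while rebuilding
# a RAID 5 set. Assumes the commonly quoted consumer-drive URE rate of one
# error per 1e14 bits read; drive size and array width are illustrative.

ure_rate_per_bit = 1e-14
drive_size_tb = 2
surviving_drives = 3            # every surviving disk must be read in full

bits_to_read = surviving_drives * drive_size_tb * 1e12 * 8
expected_ures = ure_rate_per_bit * bits_to_read

# Poisson approximation: P(no URE) = exp(-expected errors)
p_clean_rebuild = math.exp(-expected_ures)

print(f"Chance of a clean rebuild: {p_clean_rebuild:.1%}")
print(f"Chance of hitting a URE:   {1 - p_clean_rebuild:.1%}")
```

With bigger drives or more of them, the odds of a clean rebuild drop quickly, which is the core of the argument against RAID 5 on large disks.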

Unless you're doing something that is very sensitive to speed issues, I wouldn't worry about breaking things into small arrays or anything like that. Put it all into a RAID 10 and see if your performance is up to snuff. With proper caching and memory in the system it should be fine. Benchmark numbers will never quite match what you actually get, because real-world performance depends on your actual usage, actual load, disk drive performance, controller caching, disk caching, fragmentation, etc.

The best rule of thumb is to simplify the configuration as much as possible. Even if you sink a day or a week into sweating over this setup, in a year or two you're going to have to troubleshoot some issue on this server, and any unnecessary complexity will leave you wondering why you set it up the way you did.

Solution 3:

How predictable do you want the latencies for your applications to be?

Say, for instance, you have one app that is really latency-sensitive, and it shares a filesystem with your financial database. At fiscal closing, all of a sudden you get calls about the latency-sensitive app timing out. How are you going to figure that out?

On the other hand, I'm a huge fan of simplifying everything, so I'd verify that you don't have any exceptional requirements, and consolidate and simplify your configs if your requirements don't fall into any weird special cases.