Impact of RAID levels on IOPS [closed]
With regard to IOPS, I have seen several sources on the web that suggest the IOPS of a given number of disks is simply the IOPS of a single disk multiplied by the number of disks.
If my understanding of IOPS is correct (and I'm not at all sure it is), I would have thought the reality would depend on, amongst many other factors, the RAID level. With RAID 1/10, all data is duplicated across at least two disks, which reduces contention on a particular disk for some IO patterns. However, in striped RAID levels such as RAID 0/5/6, data is distributed rather than duplicated, so consecutive read requests could land on the same spindle, each blocking until the previous IO completes. Writes are even more contended.
I should add that I appreciate the reality is much more complex due to various optimisations and other factors. My question is really just driving at whether, at a very basic level, my understanding of what IOPS means is on the right track. It could be that my assertion that IOPS could even be influenced by RAID levels in such a way indicates a basic misunderstanding of the concept.
For HDDs, IOPS are generally dominated by the disk's access time, which is the sum of seek latency + rotational delay + transfer delay. As these variables depend strongly on the access pattern and have non-obvious interactions with the specific RAID layout (e.g. stripe size) and controller (e.g. read-ahead tuning), any simple reply WILL BE WRONG.
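To see how those three terms combine into a per-disk IOPS figure, here is a minimal back-of-the-envelope sketch in Python; the seek, RPM and transfer numbers are illustrative assumptions for a generic 7200 RPM disk, not measurements of any specific model:

```python
# Back-of-the-envelope HDD IOPS estimate from its access-time components.
# All numbers below are assumed, illustrative values for a 7200 RPM disk.

avg_seek_ms = 8.5                          # assumed average seek time
rpm = 7200
rotational_delay_ms = (60_000 / rpm) / 2   # half a rotation on average, ~4.17 ms
transfer_ms = 0.1                          # a 4 KiB transfer at 100+ MB/s is negligible

access_time_ms = avg_seek_ms + rotational_delay_ms + transfer_ms
iops = 1000 / access_time_ms
print(f"~{iops:.0f} random IOPS per disk")  # roughly 75-80 IOPS
```

That is why ~100 IOPS is a reasonable rule of thumb for a single 7200 RPM spindle under random load, and why sequential loads (which mostly skip the seek and rotational terms) behave so differently.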
Still, let's try for a ballpark figure. To a first approximation, the IOPS delivered by an n-disk array should be n times the IOPS of a single disk. However, both the RAID level and the data access pattern, by shifting weight between seek, rotational and transfer latency, dramatically change this first-order approximation.
Let's do some examples, assuming 100 IOPS per single disk (a typical value for 7200 RPM disks) and 4-disk arrays (except for RAID1, which is often limited to 2-way mirroring):
- a single disk delivers 100 IOPS, both reading and writing (note: due to write coalescing, write IOPS are generally higher than read IOPS, but let's ignore that for simplicity)
- RAID0 (4-way striping) has up to 4x the random IOPS and up to 4x the sequential IOPS. The key words here are "up to": due to the nature of striping and data alignment, if the randomly accessed sectors predominantly reside on a single disk, you will end up with much lower IOPS.
- RAID1 (2-way mirroring) is more complex to profile. As different disks can seek to different data, it has up to 2x the random read IOPS but the same 1x (or slightly lower, due to overhead) random write IOPS. If everything aligns well (i.e. large but not 100% sequential reads, a RAID controller that applies chunk/stripe handling even in mirroring mode, read-ahead working correctly, etc.), sequential reads can sometimes reach up to 2x the single-disk value, while sequential writes remain capped at 1x a single disk (i.e. no speedup)
- RAID10 (4 disks, striped 2-way mirrors) sits, performance-wise, halfway between 4-way RAID0 striping and 2-way mirroring. It has up to 4x the random read IOPS and up to 2x the random write IOPS. For sequential transfers the RAID1 caveat applies: it sometimes reaches up to 4x the sequential read IOPS, but only 2x the sequential write IOPS. Please note that some RAID10 implementations (namely Linux MD RAID) provide different layouts for RAID10 arrays, with different performance profiles.
- RAID5 (striped parity) has up to 4x the random read IOPS, while random write IOPS, depending on a number of factors such as how large the write is relative to the stripe size, the availability of a large stripe cache, the stripe reconstruction algorithm itself (read-reconstruct-write vs read-modify-write), etc., can be anywhere between 0.5x (or lower) and 2x the IOPS of a single disk (see the sketch after this list for the classic small-write penalty). Sequential workloads are more predictable, at about 3x the IOPS of a single disk (both reading and writing)
- RAID6 (striped double parity) behaves much like its RAID5 brother, but with lower write performance. It has up to 4x the random read IOPS of a single disk, but its random write performance is even lower than RAID5's, within the same range (0.5x - 2x) but with a lower real-world average. Sequential reads and writes are capped at 2x the IOPS of a single disk.
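To make these multipliers concrete, here is a toy Python sketch encoding them. The 4-IO and 6-IO small-write penalties it assigns to RAID5/6 are the classic textbook read-modify-write figures, not measured values, and real controllers with stripe caching can do better (or worse), as discussed above:

```python
# Toy calculator encoding the ballpark multipliers discussed above.
# It assumes best-case behaviour (perfect striping, no controller cache
# effects) and is just as approximate as the list itself.

SINGLE_DISK_IOPS = 100   # assumed 7200 RPM disk

def raid_iops(level, disks, single=SINGLE_DISK_IOPS):
    """Best-case (random_read, random_write) IOPS for the layouts above."""
    if level == 0:                        # pure striping: every spindle contributes
        return disks * single, disks * single
    if level == 1:                        # 2-way mirror: reads split, writes duplicated
        return 2 * single, 1 * single
    if level == 10:                       # striped 2-way mirrors
        return disks * single, (disks // 2) * single
    if level == 5:                        # classic 4-IO read-modify-write small-write penalty
        return disks * single, disks * single / 4
    if level == 6:                        # classic 6-IO read-modify-write small-write penalty
        return disks * single, disks * single / 6
    raise ValueError(f"unsupported RAID level: {level}")

for level, disks in [(0, 4), (1, 2), (10, 4), (5, 4), (6, 4)]:
    rread, rwrite = raid_iops(level, disks)
    print(f"RAID{level} ({disks} disks): ~{rread:.0f} random read / ~{rwrite:.0f} random write IOPS")
```

With 4 disks this prints about 400/100 for RAID5 and 400/67 for RAID6, i.e. random writes land inside the 0.5x-2x band quoted above, while random reads scale with the spindle count.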
Let me repeat: the above are simple, almost broken approximations. Anyway, if you want to play with a (severely incomplete) RAID IOPS calculator, take a look here.
Now, back to the real world. On real-world workloads, RAID10 is often the faster and preferred choice, maintaining high performance even with a degraded array. RAID5 and RAID6 should not be used for performance-sensitive workloads, unless they are read-centric or sequential in nature. It's worth noting that serious RAID controllers have a large power-loss-protected writeback cache mainly to overcome (by heavy stripe caching) the low random write performance of RAID5/6. Never use RAID5/6 with cache-less RAID controllers unless you really don't care about the array's speed.
SSDs are different beasts, though. As they have an intrinsically much lower average access time, parity-based RAID incurs a much lower performance overhead and is a much more viable option than on HDDs. However, for a small, random-write-centric workload, I would still use a RAID10 setup.
It's just a matter of definitions. You can measure IOPS at different levels in the system and you will get different values. For example, suppose you have two mirrored disks and you are writing as fast as you can. The IOPS going to the disks will be twice the number of IOPS a single disk can handle with a similar write load. But the IOPS going into the controller will be equal to the number of IOPS a single disk can handle.
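To put numbers on that distinction, here is a tiny sketch; the 100 write IOPS per disk is an assumed figure, matching the example value used in the other answer:

```python
# Logical vs. physical IOPS for a 2-way mirror under a pure write load.
# The per-disk figure is an assumption for illustration only.
disk_write_iops = 100
mirror_legs = 2

# Every logical write must land on both mirror legs, so the array accepts
# only as many logical writes per second as one disk can absorb...
logical_write_iops = disk_write_iops                      # seen at the controller: 100

# ...while the disks collectively perform twice that many physical writes.
physical_write_iops = logical_write_iops * mirror_legs    # seen at the disks: 200

print(f"controller sees ~{logical_write_iops} IOPS, disks see ~{physical_write_iops} IOPS total")
```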
Usually what we care about is how many logical IOPS we can get into the array and we don't particularly care what's happening at the disk level. In that case, you are correct and the IOPS depends on the RAID level, the number of disks, the performance of the individual disks and, in some cases, the specific characteristics of the operations.