iostat reports significantly different '%util' and 'await' for two identical disks in mdadm RAID1
I have a server running CentOS 6 with two Crucial M500 SSDs configured in mdadm RAID1. This server is also virtualized with Xen.
Recently, I started seeing iowait percentages creep up in the top -c stats of our production VM. I decided to investigate and ran iostat on the dom0 so I could inspect activity on the physical disks (e.g., /dev/sda and /dev/sdb). This is the command I used: iostat -d -x 3 3
Here's an example of the output I received (%util is the last column):
Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await r_await w_await svctm %util
sda 0.00 0.33 0.00 38.67 0.00 337.33 8.72 0.09 2.22 0.00 2.22 1.90 7.33
sdb 0.00 0.33 0.00 38.67 0.00 338.00 8.74 1.08 27.27 0.00 27.27 23.96 92.63
md2 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
md1 0.00 0.00 0.00 1.00 0.00 8.00 8.00 0.00 0.00 0.00 0.00 0.00 0.00
md0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
md127 0.00 0.00 0.00 29.33 0.00 312.00 10.64 0.00 0.00 0.00 0.00 0.00 0.00
drbd5 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
drbd3 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
drbd4 0.00 0.00 0.00 8.67 0.00 77.33 8.92 2.03 230.96 0.00 230.96 26.12 22.63
dm-0 0.00 0.00 0.00 29.67 0.00 317.33 10.70 5.11 171.56 0.00 171.56 23.91 70.93
dm-1 0.00 0.00 0.00 8.67 0.00 77.33 8.92 2.03 230.96 0.00 230.96 26.12 22.63
dm-2 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
dm-3 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
dm-4 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
dm-5 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
dm-6 0.00 0.00 0.00 20.00 0.00 240.00 12.00 3.03 151.55 0.00 151.55 31.33 62.67
dm-7 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
dm-8 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
dm-9 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
dm-10 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
dm-11 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
To my alarm, I noticed a significant difference between /dev/sda and /dev/sdb in await (2ms vs 27ms) and %util (7% vs 92%). These drives are mirrors of one another and are the same model of Crucial M500 SSD, so I don't understand how this could be. There is no activity on /dev/sda that should not also occur on /dev/sdb.
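To make the comparison concrete, here is a minimal Python sketch that parses the sda and sdb rows from the output above and prints the sdb/sda ratio for a few columns. The helper names (parse_iostat_line, mirror_skew) are made up for illustration, and the column list assumes the exact iostat -x header shown in this output; other sysstat versions use different columns.

```python
# Column names copied from the iostat -x header in the output above.
# Assumption: other sysstat versions may print different columns.
COLUMNS = ["rrqm/s", "wrqm/s", "r/s", "w/s", "rsec/s", "wsec/s",
           "avgrq-sz", "avgqu-sz", "await", "r_await", "w_await",
           "svctm", "%util"]


def parse_iostat_line(line):
    """Split one device row into (device_name, {column: float})."""
    fields = line.split()
    return fields[0], dict(zip(COLUMNS, map(float, fields[1:])))


def mirror_skew(rows, a, b, metric):
    """Return the ratio of `metric` on device b to the same metric on device a."""
    return rows[b][metric] / rows[a][metric]


# The two mirror-member rows, copied verbatim from the output above.
SAMPLE = """\
sda 0.00 0.33 0.00 38.67 0.00 337.33 8.72 0.09 2.22 0.00 2.22 1.90 7.33
sdb 0.00 0.33 0.00 38.67 0.00 338.00 8.74 1.08 27.27 0.00 27.27 23.96 92.63
"""

rows = dict(parse_iostat_line(line) for line in SAMPLE.splitlines())

for metric in ("w/s", "wsec/s", "await", "%util"):
    print(f"{metric:>7}: sda={rows['sda'][metric]:7.2f} "
          f"sdb={rows['sdb'][metric]:7.2f} "
          f"ratio={mirror_skew(rows, 'sda', 'sdb', metric):5.2f}")
```

The write rates (w/s and wsec/s) are essentially identical on both members, as expected for a mirror, while await and %util on sdb are roughly twelve times higher. In other words, sdb is taking much longer to complete the same workload.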
I've been regularly checking the SMART values for both of these disks, and I've noticed that Percent_Lifetime_Used for /dev/sda indicates 66% used while /dev/sdb reports a nonsensical value (454% used). I hadn't been too concerned up until this point because the Reallocated_Event_Count has remained relatively low for both drives and hasn't changed quickly.
SMART values for /dev/sda
SMART values for /dev/sdb
Could there be a hardware issue with our /dev/sdb disk? Any other possible explanations?
I eventually discovered that this system was not being TRIMed properly and had also been partitioned with insufficient overprovisioning (even though the Crucial M500 has 7% level 2 overprovisioning built in). The combination of the two led to a severe case of write amplification.
Furthermore, this system houses a database with very high write activity, which produces a very large number of small random writes. This kind of I/O pattern is where write amplification hurts the most.
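For anyone unfamiliar with the term, write amplification is usually expressed as the ratio of bytes physically written to flash to bytes the host asked to write. The sketch below only illustrates that arithmetic. The 4 KiB write size and 256 KiB erase-block size are hypothetical round numbers, not M500 specifications, and the scenario is a simplified worst case.

```python
def write_amplification(nand_bytes_written, host_bytes_written):
    """Write amplification factor: bytes written to flash / bytes written by the host."""
    if host_bytes_written <= 0:
        raise ValueError("host_bytes_written must be positive")
    return nand_bytes_written / host_bytes_written


# Hypothetical worst case: the drive has no free blocks (no TRIM, little
# spare area), so servicing one small random write means rewriting a whole
# erase block's worth of data. Both sizes below are illustrative only.
HOST_WRITE_BYTES = 4 * 1024      # one small database page written by the host
NAND_WRITE_BYTES = 256 * 1024    # assumed erase-block size rewritten by the SSD

print(write_amplification(NAND_WRITE_BYTES, HOST_WRITE_BYTES))  # 64.0
```

With free blocks available (from TRIM or spare area), the controller can place new writes without first relocating old data, which keeps this ratio much closer to 1.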
I'm still not 100% certain why /dev/sda was performing better than /dev/sdb in iostat. Perhaps it was something akin to the silicon lottery, where /dev/sda was marginally superior to /dev/sdb, so /dev/sdb bottlenecked first.
For us, the two major takeaways are:
- Overprovision your SSDs at 20% (taking into account that your SSD may already have 0%, 7%, or 28% level 2 overprovisioning).
- Run TRIM on a weekly basis.
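As a rough sketch of the overprovisioning arithmetic, the Python below estimates how much of a drive to leave unpartitioned to reach a target level. It assumes the common definition of overprovisioning as (physical capacity - usable capacity) / usable capacity. The function name and the 480 GB drive size are made up for illustration, so treat the result as an estimate rather than a vendor figure.

```python
def space_to_leave_unpartitioned(advertised_gb, builtin_op, target_op):
    """Estimate the GB to leave unpartitioned to reach target_op in total.

    Assumes overprovisioning = (physical - usable) / usable, with
    builtin_op and target_op given as fractions (0.07 means 7%).
    """
    physical_gb = advertised_gb * (1 + builtin_op)   # capacity incl. built-in spare area
    usable_gb = physical_gb / (1 + target_op)        # largest partition that meets the target
    return max(0.0, advertised_gb - usable_gb)       # 0 if built-in OP already suffices


# Hypothetical 480 GB drive with 7% built-in overprovisioning, 20% target.
print(round(space_to_leave_unpartitioned(480, 0.07, 0.20), 1))  # 52.0

# A drive that already has 28% built in needs nothing extra for a 20% target.
print(space_to_leave_unpartitioned(480, 0.28, 0.20))  # 0.0
```

Note that unpartitioned space only acts as spare area if those sectors have never been written or have been trimmed, which is one more reason the two takeaways go together.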