Why higher await time for DM multipath device than underlying device?
Solution 1:
As user the-wabbit suggests, request merging is going on. You can see that in the avgrq-sz column (the average request size), which shows a significant increase.
Now 'await' is the time spent in the queue plus the time spent servicing those requests. If a small request, let's call it 'x', is merged with a couple of other requests (y and z, issued after x), then x will
- wait in the queue to be merged with y
- wait in the queue to be merged with z
- wait for (x,y,z) to be completed
This obviously has a negative impact on the await statistic, mostly because of the way await is calculated, without actually indicating a problem in itself.
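As a rough, made-up illustration of the arithmetic: say x arrives at t=0 ms, y at t=2 ms, z at t=4 ms, and the merged request completes at t=10 ms. The DM device then accounts three requests with awaits of 10, 8 and 6 ms (an average of 8 ms), whereas the underlying path device only accounts the single merged request it actually serviced, say 6 ms. The same work therefore shows up with a higher await on the multipath device than on the underlying device.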
Now let's take a look at /dev/sdb (dev8-16). Did you know that you are not using that path? You have two priority groups in your multipath config, one is
status=enabled
and one is
status=active
You probably have
path_grouping_policy failover
in your configuration (which is the default).
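You can check this with
multipath -ll
which (output details vary between multipath-tools versions) lists the path groups with their priorities; I/O only goes to the group shown as status=active, while the status=enabled group is a standby that takes over on failover.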
If you want to prevent I/O errors in case both paths are down, you could try
features "1 queue_if_no_path"
in your multipath.conf.
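As an illustration only (the exact placement depends on your distribution and the rest of your configuration), such a setting would typically go into the defaults section of /etc/multipath.conf:

defaults {
        features "1 queue_if_no_path"
}

Keep in mind that with queue_if_no_path, I/O blocks indefinitely while no path is available instead of returning errors, so processes touching the device may hang until a path comes back.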
Now the real question remains: why do both paths go down?