Why higher await time for DM multipath device than underlying device?

Solution 1:

As user the-wabbit suggests, there is request merging going on. You can see that in the avgrq-sz column (the average request size), which shows a significant increase.

Now 'await' is the time spent in the queue plus the time spent servicing those requests. If a small request, let's call it 'x', is merged with a couple of other requests (y and z, issued after x), then x will

  • wait in the queue to be merged with y
  • wait in the queue to be merged with z
  • wait for (x,y,z) to be completed

This obviously inflates the await statistic, mostly because of the way await is calculated, without actually signifying a problem in itself.
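
A rough illustration with made-up numbers (assuming, for the sake of argument, that each request is accounted from the moment it is issued until the merged request completes): say x is issued at t = 0 ms, y at t = 2 ms, z at t = 4 ms, and the merged request (x,y,z) finishes at t = 6 ms. Then

    await = (6 + 4 + 2) ms / 3 requests = 4 ms

even though the device only performed a single physical I/O. The longer x sits in the queue waiting for merge candidates, the further the average gets pulled up.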

Now let's take a look at /dev/sdb (dev8-16). Did you know that you are not using that path? You have two priority groups in your multipath config, one is

status=enabled

and one is

status=active

You probably have

path_grouping_policy    failover

in your configuration (which is the default).
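
For reference, a minimal sketch of how that policy would look if spelled out explicitly in /etc/multipath.conf (shown in the defaults section here; your setup may set it per device instead):

    defaults {
            # failover = one path per priority group; only the active
            # group carries I/O, the other path is a standby (status=enabled)
            path_grouping_policy    failover
    }

With multibus instead of failover, both paths would land in a single priority group and share the I/O load, but whether that is sensible depends on your array being active/active.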

If you want to prevent the I/O errors in case both paths go down, you could try:

        features        "1 queue_if_no_path"
in your multipath.conf
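
A minimal sketch of where that would go, assuming you set it globally in the defaults section rather than per device or per multipath:

    defaults {
            # queue I/O instead of failing it while no usable path is left
            features        "1 queue_if_no_path"
    }

Keep in mind that with this setting, I/O will hang rather than return errors for as long as all paths stay down.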

Now the real question remains: why do both paths go down?