Determining which process is causing heavy disk I/O?

I've seen this question: How to identify heavy write to disk?

I've used dstat and atop before, but they don't seem to pinpoint which process is causing the disk I/O. For example, from dstat:

dstat -ta --top-bio
----system---- ----total-cpu-usage---- -dsk/total- -net/total- ---paging-- ---system-- ----most-expensive----
     time     |usr sys idl wai hiq siq| read  writ| recv  send|  in   out | int   csw |  block i/o process
14-12 16:16:25| 22   3  49  26   0   0|2324k    0 |  17k 6144B|   0     0 |1324     0 |
14-12 16:16:26| 24   3  30  43   0   0|4960k 8192B|1498B 4322B|   0     0 |1494     0 |wget          0  4096B
14-12 16:16:27| 25   4  38  33   0   0|4612k  548k|5011B   27k|   0     0 |1582     0 |kjournald     0    24k
14-12 16:16:28| 23   3  42  32   0   0|5072k    0 |  24k 4368B|   0     0 |1495     0 |

Notice how high dsk/total is: between 2 and 5 MB/sec of reads. But then look at the 'most expensive' column: it shows only a few KB here and there, and sometimes nothing at all. It's the same with atop: it shows high overall disk usage, but low usage from individual processes. I'm running CentOS 5, kernel 2.6.18-53.

Do I need a newer kernel version? Maybe some system config setting somewhere? The 'atop' homepage recommends installing some kernel patches, but I'd rather not go through the hassle of configuring & compiling my own kernel.


Solution 1:

Try iotop, for starters ;) I don't see any output from it in your post.

1: I have experienced almost the same situation with a journaling filesystem and atime updates, though with more writes.

Try remounting with noatime and turning off filesystem journaling (the latter for testing only) to see whether the I/O is filesystem-based and, as I said, use iotop to see whether it is process-based.
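As a rough sketch of the noatime step, the helper below checks whether a mount point already has noatime set before you remount it. The function name has_noatime and the /var mount point are only illustrative assumptions; the function reads /proc/mounts-style lines on stdin.

```shell
#!/bin/sh
# has_noatime MOUNTPOINT   (hypothetical helper, not a standard tool)
# Reads /proc/mounts-style lines on stdin and exits 0 if the given
# mount point lists "noatime" among its options (the 4th field).
has_noatime() {
    awk -v mp="$1" '
        $2 == mp {
            found = 0                      # the last entry for the mount point wins
            n = split($4, opts, ",")
            for (i = 1; i <= n; i++)
                if (opts[i] == "noatime") found = 1
        }
        END { exit !found }
    '
}

# Example (assumes the busy filesystem is mounted at /var; remounting needs root):
#   has_noatime /var < /proc/mounts || mount -o remount,noatime /var
```

If the read rate drops noticeably after the remount, atime updates were at least part of the problem.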

2: I assume this partition is not part of a RAID array that is currently rebuilding, is it?

3: If you have a lot of very small files (much smaller than the block device's block size and/or the filesystem block size) and you are reading them, you end up reading entire blocks from disk, and most of the data in those blocks is read for nothing.

4: If none of the above helps, you can always have the kernel log which processes are reading and writing blocks and dirtying files by executing

echo 1 > /proc/sys/vm/block_dump

Please note that this degrades system performance considerably. Instructions are available in my previous post.
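Once block_dump is enabled, the kernel log fills up quickly, so it helps to tally the entries per process. The sketch below assumes the log lines look like `kjournald(1234): WRITE block 12345 on sda1` (the exact format can vary between kernel versions); count_block_io is a hypothetical helper name.

```shell
#!/bin/sh
# count_block_io   (hypothetical helper, not a standard tool)
# Reads kernel log lines on stdin and prints "COUNT PROCESS OP",
# busiest first. Lines that are not READ/WRITE block entries
# (for example "dirtied inode" messages) are ignored.
count_block_io() {
    awk '
        # Find "name(pid): READ|WRITE block " anywhere in the line,
        # so a leading dmesg timestamp does not matter.
        match($0, /[^ ]+\([0-9]+\): (READ|WRITE) block /) {
            entry = substr($0, RSTART, RLENGTH)
            name = entry
            sub(/\([0-9]+\):.*/, "", name)   # strip "(pid): ..." to keep the name
            op = (entry ~ /: WRITE block $/) ? "WRITE" : "READ"
            count[name " " op]++
        }
        END { for (k in count) print count[k], k }
    ' | sort -k1,1nr -k2
}

# Example (as root):
#   echo 1 > /proc/sys/vm/block_dump
#   sleep 10; dmesg | count_block_io | head
#   echo 0 > /proc/sys/vm/block_dump    # turn logging off again
```

It may be worth stopping syslogd/klogd while block_dump is on, since writing the log to disk generates more I/O that then gets logged in turn.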