How to check disk I/O utilization per process?

If you are lucky enough to catch the next peak utilization period, you can study per-process I/O stats interactively, using iotop.


You can use pidstat to print per-process I/O statistics every 20 seconds (each report covers the preceding interval) with this command:

# pidstat -dl 20

Each row will have the following columns:

  • PID - process ID
  • kB_rd/s - Number of kilobytes the task has caused to be read from disk per second.
  • kB_wr/s - Number of kilobytes the task has caused, or shall cause to be written to disk per second.
  • kB_ccwr/s - Number of kilobytes whose writing to disk has been cancelled by the task. This may occur when the task truncates some dirty pagecache. In this case, some IO which another task has been accounted for will not be happening.
  • Command - The command name of the task.

Output looks like this:

05:57:12 PM       PID   kB_rd/s   kB_wr/s kB_ccwr/s  Command
05:57:32 PM       202      0.00      2.40      0.00  jbd2/sda1-8
05:57:32 PM      3000      0.00      0.20      0.00  kdeinit4: plasma-desktop [kdeinit]              

05:57:32 PM       PID   kB_rd/s   kB_wr/s kB_ccwr/s  Command
05:57:52 PM       202      0.00      0.80      0.00  jbd2/sda1-8
05:57:52 PM       411      0.00      1.20      0.00  jbd2/sda3-8
05:57:52 PM      2791      0.00     37.80      1.00  kdeinit4: kdeinit4 Running...                   
05:57:52 PM      5156      0.00      0.80      0.00  /usr/lib64/chromium/chromium --password-store=kwallet --enable-threaded-compositing 
05:57:52 PM      8651     98.20      0.00      0.00  bash 

05:57:52 PM       PID   kB_rd/s   kB_wr/s kB_ccwr/s  Command
05:58:12 PM       202      0.00      0.20      0.00  jbd2/sda1-8
05:58:12 PM      3000      0.00      0.80      0.00  kdeinit4: plasma-desktop [kdeinit]              
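
If you leave pidstat running and want to boil the reports down afterwards, something like the following may help. It is only a sketch: pidstat_peak_writers is a hypothetical helper name, it assumes the header row names its columns PID, kB_wr/s and Command as in the output above, and it keeps only the first word of the command.

```shell
# Hypothetical helper: read pidstat -d output on stdin and print the highest
# kB_wr/s seen for each PID, largest first.
pidstat_peak_writers() {
    awk '
        # Header row: remember which columns hold PID, kB_wr/s and Command.
        /kB_wr\/s/ {
            for (i = 1; i <= NF; i++) {
                if ($i == "PID") pid_col = i
                if ($i == "kB_wr/s") wr_col = i
                if ($i == "Command") cmd_col = i
            }
            next
        }
        # Data row: keep the highest write rate seen so far for this PID.
        pid_col && NF >= cmd_col {
            pid = $pid_col
            if (!(pid in peak) || $wr_col + 0 > peak[pid]) peak[pid] = $wr_col + 0
            name[pid] = $cmd_col
        }
        END {
            for (pid in peak) printf "%.2f %s %s\n", peak[pid], pid, name[pid]
        }
    ' | sort -rn
}
```

Typical use would be to save the output of pidstat -dl 20 to a file during the busy period and then run pidstat_peak_writers < that file. Column positions are looked up from each header row, so the time-stamp format should not matter.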

Nothing beats ongoing monitoring; you simply cannot get time-sensitive data back after the event.

There are, however, a couple of things you can check to implicate or rule out suspects: /proc is your friend.

sort -n -k 10 /proc/diskstats
sort -n -k 11 /proc/diskstats

Fields 10 and 11 are the accumulated sectors written and the accumulated time (ms) spent writing. This will show your hot disks and partitions.
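
If the raw sort output is hard to read, a small wrapper can print just the interesting columns. This is a sketch, not a polished tool: hot_write_devices is a made-up name, the optional file argument exists only so you can run it against a saved copy, and the MiB figure assumes the usual 512-byte sector unit of /proc/diskstats.

```shell
# Hypothetical helper: list devices by data written since boot.
# $1 (optional): a diskstats-format file; defaults to /proc/diskstats.
hot_write_devices() {
    # Columns printed: device name, MiB written (field 10 / 2048, assuming
    # 512-byte sectors), ms spent writing (field 11). Idle devices are skipped.
    awk '$10 > 0 { printf "%s %.1f %s\n", $3, $10 / 2048, $11 }' \
        "${1:-/proc/diskstats}" | sort -k2,2nr
}
```

Run it as hot_write_devices with no arguments on the affected machine. The numbers are totals since boot, so a long uptime can hide a recent spike.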

cut -d" " -f 1,2,42 /proc/[0-9]*/stat | sort -n -k 3

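Those fields are PID, command and cumulative I/O-wait ticks. This will show your hot processes, though only if they are still running. Note that cut splits on single spaces, so a command name that contains a space shifts the later fields and that row's third column is no longer field 42. (You probably want to ignore your filesystem journalling threads.)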
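
A variant that copes with spaces in command names might look like the sketch below. blkio_ticks is a hypothetical name, and the optional file arguments are there only so it can be tried on saved copies. It splits each line after the last ")" rather than on every space, and it drops processes whose counter is zero. If every process shows zero, block I/O delay accounting may simply be switched off on your kernel.

```shell
# Hypothetical helper: print "ticks pid command" sorted by field 42 of
# /proc/<pid>/stat (block I/O delay ticks), largest first.
blkio_ticks() {
    # With no arguments, read the stat file of every running process.
    [ "$#" -gt 0 ] || set -- /proc/[0-9]*/stat
    # Processes can exit between glob expansion and read, hence 2>/dev/null.
    cat "$@" 2>/dev/null | awk '
        # Greedy match up to the LAST ")" so that names containing spaces
        # or parentheses do not shift the later fields.
        match($0, /.*\)/) {
            first = index($0, "(")
            comm = substr($0, first + 1, RLENGTH - first - 1)
            n = split(substr($0, RLENGTH + 2), f, " ")
            # f[1] is field 3 (state), so field 42 is f[40].
            if (n >= 40 && f[40] + 0 > 0)
                printf "%s %s %s\n", f[40], $1, comm
        }
    ' | sort -k1,1nr
}
```

Calling blkio_ticks | head shows the handful of processes that have waited longest on block I/O since they started.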

The usefulness of the above depends on uptime, the nature of your long running processes, and how your file systems are used.

Caveat: this does not apply to pre-2.6 kernels; check your documentation if unsure.

(Now go and do your future-self a favour, install Munin/Nagios/Cacti/whatever ;-)