Is the CPU actually occupied during iowait (%wa in top) on Linux / EC2?

On an 8-way Amazon EC2 instance (running Linux 2.6.21) with 8 EBS volumes and a lot of disk traffic, we see high %wa in top (30-40%), and high load average (8-9). My understanding is that processes waiting on I/O from the EBS volumes are counted in the load average (a ps shows several processes in the D state, about as many as the load average).

However, it's not clear what %wa means. Is a CPU actually occupied waiting for a response from the EBS volume, or does the kernel schedule another process on it? I would expect that another process would be scheduled; but then I don't understand why iowait time would be expressed as a percentage of total CPU time (unless the percentages add up to more than 100%).

So long as we don't max out the I/O capacity of the EBS volumes I'm not concerned, but if the CPUs get tied up waiting for I/O I think our machine will run out of CPU capacity before running out of I/O capacity.


Solution 1:

The CPU can and will be used for other processes, provided at least one process is ready to run. There's the rub: on an I/O-bound system, every process may be waiting for I/O to complete, and with nothing runnable there is no reason to schedule the CPU for anything other than the kernel's own activities. Hence the term I/O wait: it is idle time during which I/O was outstanding, which is why it is reported as a share of total CPU time alongside user, system and idle, and the percentages still add up to 100%.
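To see how %wa relates to the other CPU columns, you can compute the same breakdown yourself from two samples of the aggregate "cpu" line in /proc/stat. The following is a minimal Python sketch, not the way top itself is implemented; the function names (parse_cpu_line, cpu_percentages, sample) are made up for illustration, and it assumes the usual column order user, nice, system, idle, iowait, irq, softirq, steal.

```python
import time

# Column order of the aggregate "cpu" line in /proc/stat (assumed layout).
FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal")


def parse_cpu_line(stat_text):
    """Return the aggregate 'cpu' counters from /proc/stat text as a dict."""
    for line in stat_text.splitlines():
        parts = line.split()
        if parts and parts[0] == "cpu":
            values = [int(v) for v in parts[1:1 + len(FIELDS)]]
            # Older kernels expose fewer columns; treat missing ones as 0.
            values += [0] * (len(FIELDS) - len(values))
            return dict(zip(FIELDS, values))
    raise ValueError("no aggregate 'cpu' line found")


def cpu_percentages(before, after):
    """Turn two counter snapshots into percentages of elapsed CPU time."""
    deltas = {name: after[name] - before[name] for name in FIELDS}
    total = sum(deltas.values())
    if total <= 0:
        return {name: 0.0 for name in FIELDS}
    return {name: 100.0 * delta / total for name, delta in deltas.items()}


def sample(interval=1.0, path="/proc/stat"):
    """Read /proc/stat twice, `interval` seconds apart (Linux only)."""
    with open(path) as f:
        before = parse_cpu_line(f.read())
    time.sleep(interval)
    with open(path) as f:
        after = parse_cpu_line(f.read())
    return cpu_percentages(before, after)
```

On a Linux host, sample() returns a dict whose values sum to 100. iowait is accounted separately from idle, so a high iowait figure means the CPUs had nothing runnable while I/O was pending, not that they were busy.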

Try running vmstat 1 and see whether the "b" column (the second column) regularly shows numbers greater than 0. If so, you're probably I/O-bound. Seeing it occasionally isn't a big deal; seeing it all the time with numbers in the 2-3 range is tolerable but not desirable; and seeing more than 5 means you're probably way too busy (although that depends on how much I/O your system can accommodate, so the threshold can be higher or lower). The "b" stands for "processes blocked": the number of processes that would otherwise run but are blocked pending the completion of an I/O.


Follow-up:

There is a known bug with heavy I/O and the newer schedulers on the 2.6 series of kernels. Try changing your scheduler to see if it has an impact.
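Assuming "scheduler" here means the block-layer I/O scheduler (the elevator), which the follow-up does not state explicitly, the current choice for a device can be read from /sys/block/<device>/queue/scheduler, where the active scheduler is shown in square brackets. A small Python sketch for inspecting it follows; the helper names are invented for illustration, and the device name you pass in depends on how your EBS volumes are attached.

```python
import re


def active_scheduler(sysfs_text):
    """Return the bracketed (active) scheduler name, or None if absent."""
    match = re.search(r"\[([^\]]+)\]", sysfs_text)
    return match.group(1) if match else None


def available_schedulers(sysfs_text):
    """Return every scheduler name listed, with the brackets stripped."""
    return [name.strip("[]") for name in sysfs_text.split()]


def read_scheduler(device, sysfs_root="/sys/block"):
    """Read the raw scheduler line for a block device (Linux only)."""
    with open(f"{sysfs_root}/{device}/queue/scheduler") as f:
        return f.read()
```

For example, active_scheduler(read_scheduler("sdf")) reports the scheduler in use for a hypothetical device sdf. Switching is done by writing one of the listed names into the same file as root; it is worth comparing %wa and the "b" column before and after the change.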