Load average is greater than the number of EC2 Compute Units

On an EC2 m1.large, with an AVG CPU Utilization graph such as this:

[Image: average CPU utilization graph]

How is it possible that the load average is greater than the number of EC2 Compute Units (4)?

cat /proc/loadavg
5.78 5.57 5.44 1/188 9388

Solution 1:

Load average is not capped at the number of compute units. It measures the number of processes that are running or waiting to run; on Linux it also counts processes in uninterruptible sleep, which usually means they are blocked on I/O. If your load average is higher than the number of CPUs available, there is a queue of processes waiting.
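To make the fields of /proc/loadavg concrete, here is a minimal Python sketch that splits the line from the question into named values. The function name parse_loadavg and the dictionary keys are made up for illustration; only the field order comes from the kernel's format.

```python
def parse_loadavg(line):
    """Split one line of /proc/loadavg into named fields."""
    one, five, fifteen, entities, last_pid = line.split()
    runnable, total = entities.split("/")
    return {
        "load_1min": float(one),
        "load_5min": float(five),
        "load_15min": float(fifteen),
        "runnable_now": int(runnable),   # scheduling entities runnable at this instant
        "total_entities": int(total),    # all processes and threads on the system
        "last_pid": int(last_pid),       # most recently created PID
    }


# The line posted in the question.
sample = "5.78 5.57 5.44 1/188 9388"
fields = parse_loadavg(sample)
print(fields["load_1min"], fields["runnable_now"], fields["total_entities"])
```

Note that the fourth field reports only one runnable entity at the moment of sampling, even though the 1-minute average is 5.78. That is a single snapshot, so treat it as a hint rather than proof, but it fits the picture of tasks blocked on something other than the CPU.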

In your case, you show a graph of CPU utilization. What that indicates is that your processes are waiting on something other than CPU time. This is almost always I/O, and most commonly disk I/O. If you look at top, you will probably see a high 'wa' (I/O wait) percentage in the CPU summary line.
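If you want the I/O wait figure without watching top, it can be derived from two samples of the aggregate "cpu" line in /proc/stat, which is the counter top's 'wa' value is based on. The sketch below assumes the standard Linux field order (user, nice, system, idle, iowait, irq, softirq, steal, ...); the function name and the two sample lines are invented for illustration and are not measurements from the system in the question.

```python
def iowait_percent(before, after):
    """Percentage of CPU time spent in iowait between two 'cpu' lines of /proc/stat."""
    def counters(line):
        parts = line.split()
        if parts[0] != "cpu":
            raise ValueError("expected the aggregate 'cpu' line")
        return [int(x) for x in parts[1:]]

    b, a = counters(before), counters(after)
    deltas = [y - x for x, y in zip(b, a)]
    # Sum user..steal only; guest time is already included in user/nice.
    total = sum(deltas[:8])
    if total == 0:
        return 0.0
    return 100.0 * deltas[4] / total   # index 4 is the iowait counter


# Made-up samples taken some seconds apart (values are in clock ticks).
before = "cpu 100 0 50 800 50 0 0 0 0 0"
after = "cpu 110 0 55 820 115 0 0 0 0 0"
print(iowait_percent(before, after))
```

On a real system you would read the first line of /proc/stat twice, a few seconds apart, and pass both lines in. A consistently high result alongside low user and system time points toward I/O rather than CPU as the bottleneck.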

top is a good next place to look to figure out why your systems are overloaded.

Solution 2:

Ok, you have a couple of fundamental misunderstandings here.

First, EC2 compute units are not analogous to the number of CPU cores. Rather, they're an abstracted representation of the relative CPU performance available. The m1.large has two CPU cores available.

Second, when the load average number exceeds the number of cores available, generally speaking, this indicates that processes are queuing up, waiting for something - typically either CPU cycles or IO. Stated differently, they're having to wait in line...

In general you don't want your load average to exceed the number of cores available.
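As a rough illustration of that rule of thumb, the sketch below divides a load average by the number of cores. The helper name load_per_core is hypothetical; the numbers are the 1-minute load from the question and the two cores of an m1.large mentioned above.

```python
import os


def load_per_core(load, cores=None):
    """Return the load average divided by the number of CPU cores."""
    if cores is None:
        # Fall back to the core count of the machine running this script.
        cores = os.cpu_count() or 1
    return load / cores


# 1-minute load from the question, on the two cores of an m1.large.
ratio = load_per_core(5.78, cores=2)
print(round(ratio, 2))
```

A ratio above 1.0 suggests that, on average, more tasks want to run (or are blocked in uninterruptible sleep) than there are cores to serve them. On a live Unix system, os.getloadavg() returns the three averages and could be passed in instead of the hard-coded value.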