Why does EC2 monitoring show 100% CPU and top only 20%?
I am running a Python script on an EC2 instance that inserts rows into a database on another instance. In EC2's monitoring I saw 100% CPU utilization, whereas top only shows 20% for the Python process. What is missing from top? Network overhead?
Solution 1:
The data exposed by top is often insufficient or misleading in virtualized environments like Amazon EC2. The reported percentage depends, among other things, on your instance type and on the utilization of the underlying processor cores, which usually don't match the virtualized hardware the hypervisor presents to you. What you are seeing is most likely caused by CPU steal time, which most Unix/Linux monitoring tools expose nowadays; see e.g. the %steal column in sar or the st field in top:
st -- Steal Time
The amount of CPU 'stolen' from this virtual machine by the hypervisor for other tasks (such as running another virtual machine).
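To check the steal share without sar or top, the same counters can be read directly from /proc/stat. The following is only a minimal sketch, assuming a Linux guest whose kernel reports steal as the eighth value on the aggregate cpu line; the helper names (parse_cpu_line, cpu_shares, read_cpu_times) are made up for this illustration and the one-second sampling interval is arbitrary.

```python
import time

# Order of the first eight counters on the aggregate "cpu" line of /proc/stat.
# guest and guest_nice are left out because they are already counted in user and nice.
FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal")


def parse_cpu_line(line):
    """Turn the aggregate 'cpu ...' line into a dict of cumulative tick counters."""
    parts = line.split()
    if not parts or parts[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line of /proc/stat")
    values = [int(v) for v in parts[1:1 + len(FIELDS)]]
    # Very old kernels report fewer columns; treat the missing ones as zero.
    values += [0] * (len(FIELDS) - len(values))
    return dict(zip(FIELDS, values))


def cpu_shares(before, after):
    """Return each counter's share (in percent) of the ticks elapsed between two samples."""
    deltas = {name: after[name] - before[name] for name in FIELDS}
    total = sum(deltas.values())
    if total <= 0:
        return {name: 0.0 for name in FIELDS}
    return {name: 100.0 * delta / total for name, delta in deltas.items()}


def read_cpu_times(path="/proc/stat"):
    """Read the current cumulative counters (Linux only)."""
    with open(path) as stat_file:
        return parse_cpu_line(stat_file.readline())


if __name__ == "__main__":
    first = read_cpu_times()
    time.sleep(1.0)  # arbitrary sampling interval
    second = read_cpu_times()
    shares = cpu_shares(first, second)
    print("steal: {steal:.1f}%  user: {user:.1f}%  idle: {idle:.1f}%".format(**shares))
```

If the steal share is large while the script is running, that would be consistent with the explanation above: the guest is asking for CPU time that the hypervisor is not granting it.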
The blog post EC2 monitoring: the case of stolen CPU provides a nice exploration and illustration of this topic:
When the top command displays 40% CPU busy but CloudWatch says the server is maxed out at 100% — which side do you take? The answer is simple (CloudWatch is correct, top is not) [...]
Please note that this hypervisor metric seems to be (easily) accessible on Unix/Linux systems only and doesn't seem to be observable on Windows (yet); see my question Is there a Windows equivalent of Unix 'CPU steal time'? for more on this problem.