How does landscape calculate memory usage?
I'm trying to debug an OOM situation on an Ubuntu 12.04 server. Looking at the memory graphs in Landscape, I noticed there wasn't any serious spike in memory usage. Then I looked at the output of the free command, and I wasn't sure how the two memory usage figures relate to each other.
Here's Landscape's output on the server:
$ landscape-sysinfo
  System load:  0.0               Processes:           93
  Usage of /:   5.6% of 19.48GB   Users logged in:     1
  Memory usage: 26%               IP address for eth0: -
  Swap usage:   2%
Then I ran the free command and got:
$ free -m
             total       used       free     shared    buffers     cached
Mem:           486        381        105          0          4        165
-/+ buffers/cache:        212        274
Swap:          255          7        248
I can understand the 2% swap usage, but where does the 26% memory usage come from?
Solution 1:
In Landscape
landscape-sysinfo gets its data directly from /proc/meminfo:
dpb@helo:~$ cat /proc/meminfo |egrep 'MemTotal:|Active:'
MemTotal: 12286760 kB
Active: 3794832 kB
dpb@helo:~$
The calculation of "Memory Usage" in this case would be:
((MemTotal - Active) / MemTotal) * 100
You can see these calculations in:
/usr/share/pyshared/landscape/sysinfo/memory.py
/usr/share/pyshared/landscape/lib/sysstats.py
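As a rough illustration, the expression can be evaluated against the sample /proc/meminfo lines shown above. This is only a sketch that takes the formula exactly as written in this answer; the helper names parse_meminfo and formula_percentage are made up here and are not Landscape's own functions. The files listed above remain the authoritative source.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of {field: value in kB}."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        fields = rest.split()
        if sep and fields:
            values[key.strip()] = int(fields[0])
    return values


def formula_percentage(meminfo):
    """Evaluate ((MemTotal - Active) / MemTotal) * 100 as stated in the answer."""
    total = meminfo["MemTotal"]
    active = meminfo["Active"]
    return (total - active) / total * 100


# The two lines from the egrep output above.
sample = """MemTotal: 12286760 kB
Active: 3794832 kB
"""

pct = formula_percentage(parse_meminfo(sample))
print(round(pct, 1))
```

To run it against a live system, pass open("/proc/meminfo").read() to parse_meminfo instead of the sample string.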
In free
The free utility also gets its data directly from /proc/meminfo:
Mem
- total: MemTotal
- used: MemTotal - MemFree
- free: MemFree
- buffers: Buffers
- cached: Cached
Buffers/cache
- used: MemTotal - MemFree - Buffers - Cached
- free: MemFree + Buffers + Cached
Swap
- total: SwapTotal
- used: SwapTotal - SwapFree
- free: SwapFree
Total
- total: MemTotal + SwapTotal
- used: MemTotal - MemFree + SwapTotal - SwapFree
- free: MemFree + SwapFree
Corrected cached -- lzap
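The mapping for free can be checked against the numbers in the question. The sketch below applies the Mem, Buffers/cache and Swap formulas to the values from the question's free -m output (in MB rather than kB, which does not affect the arithmetic). The function name free_columns and the dictionary keys are chosen here for illustration only.

```python
def free_columns(m):
    """Derive the columns printed by free from /proc/meminfo-style fields."""
    mem_used = m["MemTotal"] - m["MemFree"]
    return {
        "mem_used": mem_used,
        "mem_free": m["MemFree"],
        # The "-/+ buffers/cache" row moves buffers and cache to the free side.
        "bc_used": mem_used - m["Buffers"] - m["Cached"],
        "bc_free": m["MemFree"] + m["Buffers"] + m["Cached"],
        "swap_used": m["SwapTotal"] - m["SwapFree"],
        "swap_free": m["SwapFree"],
    }


# Values taken from the `free -m` output in the question (MB).
question = {
    "MemTotal": 486,
    "MemFree": 105,
    "Buffers": 4,
    "Cached": 165,
    "SwapTotal": 255,
    "SwapFree": 248,
}

cols = free_columns(question)
for name, value in cols.items():
    print(name, value)
```

The results match the question's output: 381 used, 212 used and 274 free after adjusting for buffers and cache, and 7 of swap used.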
Solution 2:
Those graphs don't reflect every single memory allocation or freeing event; they show samples taken from /proc/meminfo (exactly as dpb described) at given intervals. A slightly speculative explanation for why the spike doesn't show in the graph is that it simply occurred between two points at which memory usage was sampled.
I suspect that what happened here is that some process acquired lots of memory in a hurry and the OOM killer disposed of it before a sample could be taken. That would be a fairly extreme circumstance, and one in which the whole machine would be running slowly because it was swapping heavily. That load would further reduce the likelihood of the system having time to sample memory usage during that window and report it back to the Landscape server.
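A toy simulation shows how interval sampling can miss a short spike entirely. Every number here is hypothetical: the 300-second interval, the 20-second burst and the usage levels are assumptions picked for illustration, not Landscape's actual sampling settings.

```python
def sample(series, interval, offset=0):
    """Return every `interval`-th reading, starting at index `offset`."""
    return series[offset::interval]


# Hypothetical per-second memory usage (%): steady at 26%, with a
# 20-second burst to 98% that ends when the process is killed.
usage = [26] * 600
for t in range(310, 330):
    usage[t] = 98

# Assumed sampling every 300 seconds: readings at t=0 and t=300 only.
sampled = sample(usage, interval=300)

print("true peak:", max(usage))
print("peak seen by sampling:", max(sampled))
```

Both samples fall outside the burst, so a graph built from them stays flat at 26% even though usage briefly reached 98%.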