linux OOM-kill why?
Solution 1:
Well, I think your min_free_kbytes is really high. I have a 16GB machine and my min is 67584kB.
Note that vmware's ram counts as cache, because of the mmap-ed vmem
That's not always true. It only holds if the file is mmap()-ed with MAP_SHARED; otherwise dirty pages are swap-backed, which seems to be the case for you. If you add up the usage reported for that process at the bottom of your output and convert it into 4 kB pages, it equals the RSS reported in the task dump for that process.
rss:74020kB, file-rss:3099352kB
74020 + 3099352 = 3173372
3173372 / 4 = 793343
which matches the RSS column (793343) in the task dump:
[19635] 501 19635 1693624 793343 1 0 0 vmware-vmx
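The conversion above can be checked with a short sketch. It assumes 4 kB pages, as the answer does; the function and variable names are made up for illustration and the two inputs are the figures quoted from the OOM report.

```python
PAGE_SIZE_KB = 4  # assumption: 4 kB pages, as used in the answer


def kb_to_pages(kb: int, page_size_kb: int = PAGE_SIZE_KB) -> int:
    """Convert a size in kB to whole pages (integer division)."""
    return kb // page_size_kb


rss_kb = 74020         # "rss" figure from the OOM report
file_rss_kb = 3099352  # "file-rss" figure from the OOM report

total_pages = kb_to_pages(rss_kb + file_rss_kb)
print(total_pages)  # compare with the RSS column of the vmware-vmx line
```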
As for why you got OOM-killed: that's a little more tricky.
When you reach the min watermark, the kernel wants to recover memory up to the high watermark. The kernel thus has a check: if the amount of memory available to reclaim from the file cache would not be sufficient to bring you back up to the high watermark of that zone, it won't bother freeing file cache and goes straight to reclaiming from anonymous memory.
We never reclaim from active pages. So:
if (file_inactive > zone_high - free_mem) then
reclaim (zone_high - free_mem) file inactive pages
else
reclaim from anonymous pool
In your case, 55220 is not greater than 228684 - 152456 (76228), so the kernel goes to the anonymous pool.
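As a rough illustration of the check described above, here is a minimal sketch that mirrors the pseudocode. It is not the kernel's actual code; `choose_reclaim_target` is a hypothetical name, and the three inputs are the figures quoted in this answer.

```python
def choose_reclaim_target(file_inactive: int, zone_high: int, free_mem: int):
    """Mirror the pseudocode: pick which pool to reclaim from.

    Returns (pool, shortfall), where shortfall is how far free memory
    is below the zone's high watermark.
    """
    shortfall = zone_high - free_mem
    if file_inactive > shortfall:
        # Enough inactive file cache to get back to the high watermark.
        return ("file", shortfall)
    # Not enough file cache: go to anonymous memory instead.
    return ("anon", shortfall)


# Figures quoted in the answer.
pool, shortfall = choose_reclaim_target(
    file_inactive=55220, zone_high=228684, free_mem=152456
)
print(pool, shortfall)
```

With these inputs the inactive file cache is smaller than the shortfall, so the sketch picks the anonymous pool.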
The reason this is an OOM kill and not swapping is that when you breach the min watermark the kernel goes into direct reclaim mode. In this mode, doing I/O to free memory is not possible because it can cause a deadlock.
Your host would have been swapping at the time, but it was allocating faster than it could swap out.
The best way to fix this would be to reduce your min watermark to something lower, or better still, get more memory and/or reduce the number of things you run on the machine.
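To see what the setting currently is before lowering it, a small sketch like the following reads /proc/sys/vm/min_free_kbytes. The helper names are made up for illustration, and the read simply returns None on systems where the file is missing or unreadable.

```python
from pathlib import Path
from typing import Optional

MIN_FREE_PATH = Path("/proc/sys/vm/min_free_kbytes")


def parse_min_free_kbytes(text: str) -> int:
    """Parse the file contents, which are a single integer in kB."""
    return int(text.strip())


def read_min_free_kbytes(path: Path = MIN_FREE_PATH) -> Optional[int]:
    """Return the current value in kB, or None if it cannot be read."""
    try:
        return parse_min_free_kbytes(path.read_text())
    except OSError:
        return None  # not Linux, or the file is not readable


current = read_min_free_kbytes()
print("min_free_kbytes:", current)
```

Changing the value needs root, for example through the vm.min_free_kbytes sysctl. What counts as a sensible value depends on the machine, so this sketch only reads it.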