Interpreting cryptic kernel "page allocation failure" messages

Thinking out loud here, but have you considered increasing the vm.min_free_kbytes value using sysctl?

Something like:

sysctl vm.min_free_kbytes=16384 

(PS: not 100% sure what it's supposed to be on CentOS; you're more likely to find it under /proc/sys/vm/min_free_kbytes.)
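
To avoid lowering a value that is already higher, it may help to read the current setting first. A minimal sketch, assuming a POSIX shell; the helper names and the MIN_FREE_PATH override are made up for illustration, and 16384 is just the figure from above, not a tuned recommendation:

```shell
#!/bin/sh
# Sketch: only raise vm.min_free_kbytes, never lower it.
# MIN_FREE_PATH is a hypothetical override so the helpers can be tried without root.
MIN_FREE_PATH="${MIN_FREE_PATH:-/proc/sys/vm/min_free_kbytes}"

# Print the current reserve in kB.
current_min_free() {
    cat "$MIN_FREE_PATH"
}

# Succeeds (exit status 0) when the current value is below the target in $1.
needs_raise() {
    [ "$(current_min_free)" -lt "$1" ]
}

# Example use, as root:
#   needs_raise 16384 && sysctl -w vm.min_free_kbytes=16384
# To keep the setting across reboots, add this line to /etc/sysctl.conf:
#   vm.min_free_kbytes = 16384
```

The change made with sysctl -w takes effect immediately but is lost on reboot unless it is also written to /etc/sysctl.conf.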


I've been seeing a lot of those as well, especially on my mirror server running Apache. On that server, changing the SLAB allocator to SLUB got rid of the issue altogether.

On another machine with a large-MTU interface, I'm still getting allocation failures in a similar path, but this time at order 5. I haven't found a solution for that one yet.
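
For high-order failures like this, /proc/buddyinfo shows how many free blocks of each order every zone still has (the columns after the zone name run from order 0 upwards). A rough sketch for counting what is left at or above a given order; the function name is made up, and the optional file argument only exists so it can be tried on a saved copy:

```shell
#!/bin/sh
# Sum the free blocks of order >= $1 across all zones listed in /proc/buddyinfo.
# Line layout: "Node 0, zone Normal <order0> <order1> ...", so order n is field 5+n.
free_blocks_at_order() {
    order="$1"
    file="${2:-/proc/buddyinfo}"   # optional second argument: a saved copy of the file
    awk -v o="$order" '
        { for (i = 5 + o; i <= NF; i++) sum += $i }
        END { print sum + 0 }
    ' "$file"
}

# Example: free_blocks_at_order 5
```

If that number sits at or near zero around the time of the failures, fragmentation rather than a plain lack of free memory is the more likely culprit.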

Another thing that partially helps, or rather reduces the frequency a little, is frequent memory compaction (echo 1 > /proc/sys/vm/compact_memory, run every minute from cron).
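
A small wrapper makes the cron job a bit safer, since /proc/sys/vm/compact_memory only exists on kernels built with compaction support. This is a sketch; the function name, the COMPACT_PATH override and the script path in the comment are hypothetical:

```shell
#!/bin/sh
# Trigger a full memory compaction if the kernel exposes the knob.
# COMPACT_PATH is a hypothetical override so the function can be tried without root.
COMPACT_PATH="${COMPACT_PATH:-/proc/sys/vm/compact_memory}"

compact_now() {
    if [ -w "$COMPACT_PATH" ]; then
        echo 1 > "$COMPACT_PATH"
    else
        echo "cannot write $COMPACT_PATH (not root, or no compaction support)" >&2
        return 1
    fi
}

# Saved as e.g. /usr/local/sbin/compact-memory.sh with a final "compact_now" line,
# it could be run from root's crontab every minute:
#   * * * * * /usr/local/sbin/compact-memory.sh
```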

Another thing worth looking at is how your application works with memory, i.e. how it allocates and frees it. If there are frequent allocations and deallocations, it may be worth trying some kind of memory pool.

Last but not least, it is worth trying to enable or disable (transparent) hugepages.
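
Before toggling it, it helps to know which mode is active. The sysfs file lists all modes and marks the active one with square brackets, e.g. "always [madvise] never". A sketch that extracts it; the function name is made up, and the path is the mainline one (as far as I know, older RHEL/CentOS 6 kernels use /sys/kernel/mm/redhat_transparent_hugepage/enabled instead):

```shell
#!/bin/sh
# Print the active transparent hugepage mode (the entry in square brackets).
thp_mode() {
    file="${1:-/sys/kernel/mm/transparent_hugepage/enabled}"
    sed -n 's/.*\[\(.*\)\].*/\1/p' "$file"
}

# Example: thp_mode
# To switch modes until the next reboot, as root:
#   echo never > /sys/kernel/mm/transparent_hugepage/enabled
```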


The issue here was out-of-date VMware guest drivers (vmware-tools) combined with a newer OS under load. This is something that gets revised as ESXi updates are released: out-of-the-box point releases of VMware show this issue, while updated versions do not.

Of course, there's the question of how to cleanly update your VMware installation...