Why does my VMware / Linux host slow down every 6 hours?

Solution 1:

Check vmmemctl memory usage. I had a similar problem; see "RedHat Linux: server paging, sum of RES/RSS + buffers + cached < TOTAL. Who is using my memory?"

In my case, we had a server with 8GB of RAM and couldn't find which process was using the memory. This is what vmmemctl reported:

cat /proc/vmmemctl

target:              1000894 pages
current:             1000894 pages
rateNoSleepAlloc:      16384 pages/sec
rateSleepAlloc:         2048 pages/sec
rateFree:              16384 pages/sec

timer:                325664
start:                     3 (   0 failed)
guestType:                 3 (   0 failed)
lock:                3623088 (  29 failed)
unlock:               623698 (   0 failed)
target:               325664 (   2 failed)
primNoSleepAlloc:    3620199 (  11 failed)
primCanSleepAlloc:      2900 (   0 failed)
primFree:            2622165
errAlloc:                 28
errFree:                  28

getconf PAGESIZE
4096

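So vmmemctl is holding 1000894 pages of 4096 bytes each, which is about 4GB (roughly 3.8 GiB), or about half of the server's RAM.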
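
The arithmetic above can be scripted. This is a minimal sketch, assuming your guest exposes the balloon statistics in the same format shown here; the function name balloon_mib is my own, not part of VMware Tools.

```shell
#!/bin/sh
# Convert the balloon size reported by vmmemctl from pages to MiB.
# Usage: balloon_mib [FILE] [PAGESIZE]
# FILE defaults to /proc/vmmemctl (assumed to exist on this guest),
# PAGESIZE defaults to the value reported by getconf.
balloon_mib() {
    file=${1:-/proc/vmmemctl}
    pagesize=${2:-$(getconf PAGESIZE)}
    # The "current:" line holds the number of pages currently in the balloon.
    awk -v ps="$pagesize" \
        '$1 == "current:" { printf "%d\n", $2 * ps / 1048576; exit }' "$file"
}
```

Running balloon_mib with no arguments on the guest prints the balloon size in MiB; for the figures above (1000894 pages of 4096 bytes) that is 3909.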

It's a pity that vmmemctl doesn't report how much memory it's using in a standard way, but I think that's because of how it's implemented: the balloon is a kernel driver that allocates and pins pages, so the memory isn't attributed to any process in top or ps.

The main reference from VMware offers a lot of detail about ballooning. I quote it since it's relevant to our original problem ('why is this server paging if it has unused memory?'):

"Typically, the hypervisor inflates the virtual machine balloon when it is under memory pressure. By inflating the balloon, a virtual machine consumes less physical memory on the host, but more physical memory inside the guest. As a result, the hypervisor offloads some of its memory overload to the guest operating system while slightly loading the virtual machine. That is, the hypervisor transfers the memory pressure from the host to the virtual machine. Ballooning induces guest memory pressure. In response, the balloon driver allocates and pins guest physical memory. The guest operating system determines if it needs to page out guest physical memory to satisfy the balloon driver’s allocation requests. If the virtual machine has plenty of free guest physical memory, inflating the balloon will induce no paging and will not impact guest performance. In this case, as illustrated in Figure 6, the balloon driver allocates the free guest physical memory from the guest free list. Hence, guest-level paging is not necessary.

However, if the guest is already under memory pressure, the guest operating system decides which guest physical pages to be paged out to the virtual swap device in order to satisfy the balloon driver’s allocation requests. The genius of ballooning is that it allows the guest operating system to intelligently make the hard decision about which pages to be paged out without the hypervisor’s involvement."

"genius of ballooning" :)

Solution 2:

Run slabtop or parse /proc/slabinfo and look at your kernel slabs; it’s very common for the kernel to cache a lot of directory entries and inodes (dentry_cache, ext3_inode_cache) on a system, especially one with lots of file access like an Apache server that’s sending lots of static content (images, etc.). This is where your “missing” memory is usually.
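
If you'd rather parse /proc/slabinfo than read slabtop interactively, something like the following works as a rough sketch. It assumes the slabinfo 2.x column layout (name, active_objs, num_objs, objsize, ...) and estimates each cache as num_objs * objsize, which ignores per-slab overhead. The function name top_slabs is made up for this example, and reading /proc/slabinfo usually requires root.

```shell
#!/bin/sh
# List the largest kernel slab caches by approximate size in KiB.
# Usage: top_slabs [FILE] [COUNT]
top_slabs() {
    file=${1:-/proc/slabinfo}
    count=${2:-10}
    # Skip the two header lines, then print "<KiB> <cache name>" per cache.
    awk '$1 == "slabinfo" || $1 == "#" { next }
         { printf "%d %s\n", $3 * $4 / 1024, $1 }' "$file" |
        sort -rn | head -n "$count"
}
```

If dentry or inode caches top the list, that is most likely where the "missing" memory went.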

If this tweaks you out you can raise vm.vfs_cache_pressure in /etc/sysctl.conf so the kernel reclaims dentry and inode caches more aggressively, but I highly recommend understanding why the memory is being used first.
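
For reference, the kernel exposes this knob as vm.vfs_cache_pressure (default 100; higher values make it reclaim dentries and inodes sooner). A small sketch for checking and changing it follows; the helper name cache_pressure and the value 200 are only illustrative assumptions, not a recommendation.

```shell
#!/bin/sh
# Print the current vfs_cache_pressure setting.
# FILE defaults to the live kernel value under /proc/sys.
cache_pressure() {
    cat "${1:-/proc/sys/vm/vfs_cache_pressure}"
}

# To try a higher value until the next reboot (as root):
#   sysctl -w vm.vfs_cache_pressure=200
# To make it persistent, add this line to /etc/sysctl.conf and run "sysctl -p":
#   vm.vfs_cache_pressure = 200
```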