Invisible memory leak on Linux - Ubuntu Server (not disk cache/buffers!)
Solution 1:
My conclusion is that the leak is somewhere in the Linux kernel itself, which is why none of the userspace tools can show where the memory is going. It may be related to this question: https://serverfault.com/questions/670423/linux-memory-usage-higher-than-sum-of-processes
I upgraded the kernel from 3.13 to 3.19, and the memory leak seems to have stopped. I will report back if I see a leak again.
It would still be useful to have an easier way to see how much memory is used by the different parts of the Linux kernel. It is still a mystery what was causing the leak in 3.13.
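One partial way to do that, offered here only as a sketch and not verified against the 3.13 leak, is to snapshot the kernel-side counters in /proc/meminfo and the slab caches over time; memory used there never shows up in per-process tools:
# Kernel-side memory counters, in kB
grep -E '^(Slab|SReclaimable|SUnreclaim|VmallocUsed|KernelStack|PageTables)' /proc/meminfo
# Top slab caches, printed once; a cache that grows steadily between snapshots hints at the leaking subsystem
sudo slabtop -o | head -n 20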
Solution 2:
Story
I can reproduce your issue using ZFS on Linux.
Here is a server called node51 with 20GB of RAM. I marked 16GiB of RAM to be allocatable to the ZFS adaptive replacement cache (ARC):
root@node51 [~]# echo 17179869184 > /sys/module/zfs/parameters/zfs_arc_max
root@node51 [~]# grep c_max /proc/spl/kstat/zfs/arcstats
c_max 4 17179869184
Then, I read a 45GiB file using Pipe Viewer in my ZFS pool zeltik to fill up the ARC:
root@node51 [~]# pv /zeltik/backup-backups/2014.04.11.squashfs > /dev/zero
45GB 0:01:20 [ 575MB/s] [==================================>] 100%
Now look at the free memory:
root@node51 [~]# free -m
             total       used       free     shared    buffers     cached
Mem:         20013      19810        203          1         51         69
-/+ buffers/cache:      19688        324
Swap:         7557          0       7556
Look!
51MiB in buffers
69MiB in cache
120MiB in both
19810MiB of RAM in use, including buffers and cache
19688MiB of RAM in use, excluding buffers and cache (the "-/+ buffers/cache" row)
The Python script that you referenced reports that applications are only using a small amount of RAM:
root@node51 [~]# python ps_mem.py
Private + Shared = RAM used Program
148.0 KiB + 54.0 KiB = 202.0 KiB acpid
176.0 KiB + 47.0 KiB = 223.0 KiB swapspace
184.0 KiB + 51.0 KiB = 235.0 KiB atd
220.0 KiB + 57.0 KiB = 277.0 KiB rpc.idmapd
304.0 KiB + 62.0 KiB = 366.0 KiB irqbalance
312.0 KiB + 64.0 KiB = 376.0 KiB sftp-server
308.0 KiB + 89.0 KiB = 397.0 KiB rpcbind
300.0 KiB + 104.5 KiB = 404.5 KiB cron
368.0 KiB + 99.0 KiB = 467.0 KiB upstart-socket-bridge
560.0 KiB + 180.0 KiB = 740.0 KiB systemd-logind
724.0 KiB + 93.0 KiB = 817.0 KiB dbus-daemon
720.0 KiB + 136.0 KiB = 856.0 KiB systemd-udevd
912.0 KiB + 118.5 KiB = 1.0 MiB upstart-udev-bridge
920.0 KiB + 180.0 KiB = 1.1 MiB rpc.statd (2)
1.0 MiB + 129.5 KiB = 1.1 MiB screen
1.1 MiB + 84.5 KiB = 1.2 MiB upstart-file-bridge
960.0 KiB + 452.0 KiB = 1.4 MiB getty (6)
1.6 MiB + 143.0 KiB = 1.7 MiB init
5.1 MiB + 1.5 MiB = 6.5 MiB bash (3)
5.7 MiB + 5.2 MiB = 10.9 MiB sshd (8)
11.7 MiB + 322.0 KiB = 12.0 MiB glusterd
27.3 MiB + 99.0 KiB = 27.4 MiB rsyslogd
67.4 MiB + 453.0 KiB = 67.8 MiB glusterfsd (2)
---------------------------------
137.4 MiB
=================================
19688MiB - 137.4MiB ≈ 19551MiB of unaccounted RAM
Explanation
The 120MiB of buffers and cache that you saw in the story above accounts for the kernel's efficient behavior of caching data sent to or received from an external device.
The first row, labeled Mem, displays physical memory utilization, including the amount of memory allocated to buffers and caches. A buffer, also called buffer memory, is usually defined as a portion of memory that is set aside as a temporary holding place for data that is being sent to or received from an external device, such as an HDD, keyboard, printer, or network.
The second line of data, which begins with -/+ buffers/cache, shows the amount of physical memory currently devoted to the system buffer cache. This is particularly meaningful for application programs, because all data accessed from files on the system through read() and write() system calls passes through this cache. The cache can greatly speed up access to data by reducing or eliminating the need to read from or write to the HDD or other disk.
Source: http://www.linfo.org/free.html
Now how do we account for the missing 19551MiB?
In the free -m output above, the 19688MiB "used" in "-/+ buffers/cache" comes from this formula:
(kb_main_used) - (buffers_plus_cached) =
(kb_main_total - kb_main_free) - (kb_main_buffers + kb_main_cached)
kb_main_total: MemTotal from /proc/meminfo
kb_main_free: MemFree from /proc/meminfo
kb_main_buffers: Buffers from /proc/meminfo
kb_main_cached: Cached from /proc/meminfo
Source: procps/free.c and procps/proc/sysinfo.c
(If you do the numbers based on my free -m output, you'll notice that 2MiB aren't accounted for, but that's because of rounding errors introduced by this code: #define S(X) ( ((unsigned long long)(X) << 10) >> shift))
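As a sanity check (my own sketch, not code from procps), you can recompute that "used" figure directly from /proc/meminfo:
# Recompute "used, excluding buffers/cache" from /proc/meminfo (values there are in kB)
awk '/^MemTotal:/{t=$2} /^MemFree:/{f=$2} /^Buffers:/{b=$2} /^Cached:/{c=$2} END{printf "used excluding buffers/cache: %d MiB\n", (t-f-b-c)/1024}' /proc/meminfo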
The numbers don't add up in /proc/meminfo, either (I didn't record /proc/meminfo when I ran free -m, but we can see from your question that /proc/meminfo doesn't show where the missing RAM is), so we can conclude from the above that /proc/meminfo doesn't tell the whole story.
In my testing conditions, I know as a control that ZFS on Linux is responsible for the high RAM usage. I told its ARC that it could use up to 16GiB of the server's RAM.
ZFS on Linux isn't a process. It's a kernel module.
From what I've found so far, the RAM usage of a kernel module wouldn't show up using process information tools because the module isn't a process.
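If ZFS is the suspect on your system too, the ARC at least reports its own size through the same kstat interface used above. This check is an illustration I am adding, not something from the original question; note that ARC memory typically does not appear under Buffers or Cached in /proc/meminfo, which is exactly why it looks invisible to free:
# Current ARC size, in bytes
grep '^size ' /proc/spl/kstat/zfs/arcstats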
Troubleshooting
Unfortunately, I don't know enough about Linux to offer you a way to build a list of how much RAM non-process components (like the kernel and its modules) are using.
At this point, we can speculate, guess, and check.
You provided a dmesg output. Well-designed kernel modules would log some of their details to dmesg.
After looking through dmesg, one item stood out to me: FS-Cache.
FS-Cache relates to the cachefiles kernel module and to the cachefilesd package on Debian and Red Hat Enterprise Linux.
Perhaps some time ago, you configured FS-Cache on a RAM disk to reduce the impact of network I/O as your server analyzes the video data.
Try disabling any suspicious kernel modules that could be eating up RAM. They can probably be disabled with a blacklist entry in /etc/modprobe.d/, followed by sudo update-initramfs -u (commands and locations may vary by Linux distribution); a sketch follows.
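For example, assuming the suspect is the cachefiles module mentioned above (substitute whatever module you actually suspect):
# Prevent the module from loading at boot
echo "blacklist cachefiles" | sudo tee /etc/modprobe.d/blacklist-cachefiles.conf
sudo update-initramfs -u
# After a reboot, confirm that it is no longer loaded
lsmod | grep cachefiles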
Conclusion
A memory leak is eating up 8MB/hr of your RAM and won't release the RAM, seemingly no matter what you do. I was not able to determine the source of your memory leak based on the information that you provided, nor was I able to offer a way to find that memory leak.
Someone who is more experienced with Linux than I will need to provide input on how we can determine where the "other" RAM usage is going.
I have started a bounty on this question to see if we can get a better answer than "speculate, guess, and check".
Solution 3:
Did you change the swappiness of your kernel manually, or disable it?
You can check your current swappiness level with:
cat /proc/sys/vm/swappiness
You could try forcing your kernel to swap aggressively with:
sudo sysctl -w vm.swappiness=100
If this reduces your problem, find a good value between 1 and 100 that fits your requirements.
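To keep a chosen value across reboots, you can persist it through the standard sysctl configuration mechanism (not specific to this problem; 60 below is just a placeholder value):
echo "vm.swappiness=60" | sudo tee /etc/sysctl.d/99-swappiness.conf
sudo sysctl --system   # reload all sysctl configuration files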