Why can't my Linux kernel reclaim its slab memory?
I have a system that suffered from ever-increasing memory usage until it reached the point where it was hitting swap even for mundane things and consequently became pretty unresponsive. The culprit appears to have been kernel-allocated memory, but I'm having difficulty figuring out what exactly was going on in the kernel.
How can I tell which kernel threads/modules/whatever are responsible for particular chunks of kernel memory usage?
Here's a graph of the system's memory usage over time:
The slab_unrecl value, which grows over time, corresponds to the SUnreclaim field in /proc/meminfo.
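For reference, the slab counters behind that graph can be read directly from /proc/meminfo; a minimal snapshot of the relevant fields looks like this:

grep -E '^(Slab|SReclaimable|SUnreclaim):' /proc/meminfo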
When I ran slabtop towards the end of that graph and sorted it by cache size, here's what it showed me:
Active / Total Objects (% used) : 15451251 / 15530002 (99.5%)
Active / Total Slabs (% used) : 399651 / 399651 (100.0%)
Active / Total Caches (% used) : 85 / 113 (75.2%)
Active / Total Size (% used) : 2394126.21K / 2416458.60K (99.1%)
Minimum / Average / Maximum Object : 0.01K / 0.16K / 18.62K
OBJS ACTIVE USE OBJ SIZE SLABS OBJ/SLAB CACHE SIZE NAME
3646503 3646503 100% 0.38K 173643 21 1389144K kmem_cache
3852288 3851906 99% 0.06K 60192 64 240768K kmalloc-64
3646656 3646656 100% 0.06K 56979 64 227916K kmem_cache_node
1441760 1441675 99% 0.12K 45055 32 180220K kmalloc-128
499136 494535 99% 0.25K 15598 32 124784K kmalloc-256
1066842 1066632 99% 0.09K 25401 42 101604K kmalloc-96
101430 101192 99% 0.19K 4830 21 19320K kmalloc-192
19168 17621 91% 1.00K 599 32 19168K kmalloc-1024
8386 7470 89% 2.00K 525 16 16800K kmalloc-2048
15000 9815 65% 1.05K 500 30 16000K ext4_inode_cache
66024 45955 69% 0.19K 3144 21 12576K dentry
369536 369536 100% 0.03K 2887 128 11548K kmalloc-32
18441 16586 89% 0.58K 683 27 10928K inode_cache
44331 42665 96% 0.19K 2111 21 8444K cred_jar
12208 7529 61% 0.57K 436 28 6976K radix_tree_node
627 580 92% 9.12K 209 3 6688K task_struct
6720 6328 94% 0.65K 280 24 4480K proc_inode_cache
36006 36006 100% 0.12K 1059 34 4236K kernfs_node_cache
266752 266752 100% 0.02K 1042 256 4168K kmalloc-16
134640 133960 99% 0.02K 792 170 3168K fsnotify_mark_connector
1568 1461 93% 2.00K 98 16 3136K mm_struct
1245 1165 93% 2.06K 83 15 2656K sighand_cache
Conclusions:
- The kernel's slab allocator is using about 2.3 GB of RAM
- Almost all of that is unreclaimable
- About 1.3 GB of it is occupied by the kmem_cache cache
- Another 0.5 GB belongs to the various-sized kmalloc caches
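To cross-check those totals outside of slabtop, a rough per-cache breakdown can be pulled straight from /proc/slabinfo (needs root; this assumes the usual slabinfo 2.1 column layout, where the third field is the total object count and the fourth is the object size in bytes):

# approximate per-cache footprint in KiB = total objects * object size in bytes
sudo awk '!/^(slabinfo|#)/ {printf "%10.0f K  %s\n", $3*$4/1024, $1}' /proc/slabinfo | sort -rn | head -15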
This is where I've hit a wall. I haven't figured out how to look inside those caches and see why they've gotten so large (or why their memory is unreclaimable). How can I go further in my investigations?
Solution 1:
perf kmem record --slab will capture profiling data, and perf kmem stat --slab --caller will subtotal it by kernel symbol.
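For example, something along these lines (the -a and the sleep duration are just illustrative; adjust them to your workload):

# trace slab allocation/free events system-wide while sleep runs, then summarize by call site
sudo perf kmem record --slab -a sleep 30
sudo perf kmem stat --slab --caller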
That doesn't explain why your workload does this, however. Add in perf record and look at the report to see what is calling into the kernel.
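Something like:

# sample all CPUs with call graphs for 30 seconds, then browse the report
sudo perf record -a -g sleep 30
sudo perf report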
kprobes can trace the specific kernel stacks that lead to a given type of allocation. I'm not super familiar with this myself, but try reading the examples accompanying eBPF scripts like slabratetop.
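If you have bcc installed, slabratetop gives a live view of which caches allocations are coming from; the tool's name and install path vary by distro, so this is just a sketch:

# show the most active slab caches, refreshing every 5 seconds
sudo /usr/share/bcc/tools/slabratetop 5
# on Debian/Ubuntu the packaged name is typically slabratetop-bpfcc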
Also vary things up a bit on your host. Add RAM to be sure you are not undersizing it. Try newer kernel versions or a different distribution.