Nothing seems to account for 5GB of memory that is missing under Linux

I have Kubuntu 16.04 and am using ZFS for a big data partition (RAIDZ1).

I am missing 5 GB of RAM and can't find out where it went. And NO, IT'S NOT CACHE.

According to all tools I could come up with I currently have the following stats:

Actively used:  9458.3 MB
Inactively used: 2544.5 MB
Mapped for IO:  2433.0 MB
Buffers:         242.6 MB
Slab:           6669.2 MB
Page Tables:      91.0 MB
Cached:         3758.1 MB
Dirty:             0.1 MB
Writeback:         0.0 MB
Free:           1856.5 MB
------------------------------
Total:         27053.3 MB
Total Memory:  32133.5 MB
------------------------------
Missing ???:    5080.2 MB

So 5 GB!! of RAM are not accounted for. Where did they go?
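For anyone who wants to repeat this tally, here is a minimal sketch that sums the same categories straight from /proc/meminfo. The function name `tally_meminfo` is made up for illustration. Note that it only mirrors the hand-made table: several of these categories overlap (mapped pages, for example, are also counted as cached or anonymous), so it is not an exact accounting.

```shell
#!/bin/sh
# tally_meminfo [FILE]: sum the same categories as the hand-made table and
# print what is left over. FILE defaults to /proc/meminfo.
# Hypothetical helper; the categories overlap, so this is only a rough check.
tally_meminfo() {
    awk '
        # Lines look like "Name:   12345 kB"; strip the colon from the name.
        { sub(":", "", $1); kb[$1] = $2 }
        END {
            sum  = kb["Active"] + kb["Inactive"] + kb["Mapped"]
            sum += kb["Buffers"] + kb["Slab"] + kb["PageTables"]
            sum += kb["Cached"] + kb["Dirty"] + kb["Writeback"]
            sum += kb["MemFree"]
            printf "Total:        %10.1f MB\n", sum / 1024
            printf "Total Memory: %10.1f MB\n", kb["MemTotal"] / 1024
            printf "Missing:      %10.1f MB\n", (kb["MemTotal"] - sum) / 1024
        }
    ' "${1:-/proc/meminfo}"
}
```

Calling `tally_meminfo` without arguments reads the live system; passing a saved copy of /proc/meminfo as the argument lets you compare snapshots taken at different times.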

Tool outputs

nmon:

           RAM     High      Low     Swap    Page Size=4 KB                                                        │
│ Total MB     32133.5     -0.0     -0.0   8195.5                                                                        │
│ Free  MB      1856.5     -0.0     -0.0   8195.5                                                                        │
│ Free Percent     5.8%   100.0%   100.0%   100.0%                                                                       │
│             MB                  MB                  MB                                                                 │
│                      Cached=  3758.1     Active=  9458.3                                                               │
│ Buffers=   242.6 Swapcached=     0.0  Inactive =  2544.5                                                               │
│ Dirty  =     0.1 Writeback =     0.0  Mapped   =  2433.0                                                               │
│ Slab   =  6669.2 Commit_AS = 16647.1 PageTables=    91.3                                                               │
│ Large (Huge) Page Stats ───────────────────────────────────────────────────────────────────────────────────────────────│
│  There are no Huge Pages                                                                                               │
│  - see /proc/meminfo                                                                                                   │
│                                                                                                                        │
│ Virtual-Memory ────────────────────────────────────────────────────────────────────────────────────────────────────────│
│nr_dirty    =       37 pgpgin      =       0                High Normal    DMA                                          │
│nr_writeback=        0 pgpgout     =       0  alloc            0    343      0                                          │
│nr_unstable =        0 pgpswpin    =       0  refill           0      0      0                                          │
│nr_table_pgs=    23384 pgpswpout   =       0  steal            0      0      0                                          │
│nr_mapped   =   622856 pgfree      =     305  scan_kswapd      0      0      0                                          │
│nr_slab     =       -1 pgactivate  =       0  scan_direct      0      0      0                                          │
│                       pgdeactivate=       0                                                                            │
│allocstall  =        0 pgfault     =      74  kswapd_steal     =      0                                                 │
│pageoutrun  =        0 pgmajfault  =       0  kswapd_inodesteal=      0                                                 │
│slabs_scanned=       0 pgrotated   =       0  pginodesteal     =      0

cat /proc/meminfo

MemTotal:       32904740 kB
MemFree:         1759548 kB
MemAvailable:    5372548 kB
Buffers:          249072 kB
Cached:          3852616 kB
SwapCached:            0 kB
Active:          9819328 kB
Inactive:        2609860 kB
Active(anon):    8334856 kB
Inactive(anon):   412176 kB
Active(file):    1484472 kB
Inactive(file):  2197684 kB
Unevictable:        7932 kB
Mlocked:            7932 kB
SwapTotal:       8392188 kB
SwapFree:        8392188 kB
Dirty:               224 kB
Writeback:             0 kB
AnonPages:       8335424 kB
Mapped:          2497092 kB
Shmem:            416004 kB
Slab:            6829716 kB
SReclaimable:     333652 kB
SUnreclaim:      6496064 kB
KernelStack:       26496 kB
PageTables:        95636 kB
NFS_Unstable:          0 kB
Bounce:                0 kB
WritebackTmp:          0 kB
CommitLimit:    24844556 kB
Committed_AS:   17097748 kB
VmallocTotal:   34359738367 kB
VmallocUsed:           0 kB                                                                                               
VmallocChunk:          0 kB                                                                                               
HardwareCorrupted:     0 kB                                                                                               
AnonHugePages:   4306944 kB                                                                                               
CmaTotal:              0 kB                                                                                               
CmaFree:               0 kB                                                                                               
HugePages_Total:       0                                                                                                  
HugePages_Free:        0                                                                                                  
HugePages_Rsvd:        0                                                                                                  
HugePages_Surp:        0                                                                                                  
Hugepagesize:       2048 kB                                                                                               
DirectMap4k:     9029708 kB                                                                                               
DirectMap2M:    22384640 kB                                                                                               
DirectMap1G:     2097152 kB 

free -h

              total        used        free      shared  buff/cache   available                                           
Mem:            31G         19G        1.8G        406M         10G        5.2G                                           
Swap:          8.0G          0B        8.0G 

atop

MEM | tot    31.4G  | free    1.8G |               | cache   3.7G |  dirty   0.1M | buff  243.5M  |              |  slab    6.5G |               |              |               |              |               |
SWP | tot     8.0G  | free    8.0G |               |              |               |               |              |               |               |              |               | vmcom  16.3G |  vmlim  23.7G |

Additional info/story

I originally had only 16 GB of memory and saw that the system started to swap slightly, though not constantly. It seemed like the memory usage grew until it started to use a few megabytes of swap and then stopped growing. This was the first time I learned about "slab" and found out that a lot of my memory went there due to ZFS.

Great, no problem, so I installed another 16 GB of memory. This should do the job, right? But instead I saw the same behaviour again: memory grew until it started to use the swap slightly. This time, however, I could not find out where 5 GB are going. Under Windows I am used to being able to find the purpose of each page of memory with the right tools (https://www.youtube.com/watch?v=AjTl53I_qzY), but here I am a bit lost. 5 GB are just gone. Is this a kernel memory leak?

For now I have set swappiness to 15. This seems to prevent swap usage "for now", but the 5 GB are still gone.

Update 1: After running for 2 weeks now, this effect is nearly gone.

│ Memory Stats ────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────                                                                                                                       │
│                RAM     High      Low     Swap    Page Size=4 KB                                                                                                                                                                                                             │
│ Total MB     32133.5     -0.0     -0.0   8195.5                                                                                                                                                                                                                             │
│ Free  MB      1871.9     -0.0     -0.0   8186.4                                                                                                                                                                                                                             │
│ Free Percent     5.8%   100.0%   100.0%    99.9%                                                                                                                                                                                                                            │
│             MB                  MB                  MB                                                                                                                                                                                                                      │
│                      Cached=  6275.9     Active= 10217.1                                                                                                                                                                                                                    │
│ Buffers=   363.4 Swapcached=     1.3  Inactive =  3134.5                                                                                                                                                                                                                    │
│ Dirty  =     0.2 Writeback =     0.0  Mapped   =  3911.0                                                                                                                                                                                                                    │
│ Slab   =  6159.5 Commit_AS = 13696.9 PageTables=    95.0

Actively used: 10217.1 MB
Inactively used: 3134.5 MB
Mapped for IO:  3911.0 MB
Buffers:         363.4 MB
Slab:           6159.5 MB
Page Tables:      95.0 MB
Cached:         6275.9 MB
Dirty:             0.2 MB
Writeback:         0.0 MB
Free:           1871.9 MB
------------------------------
Total:         32028.3 MB
Total Memory:  32133.5 MB
------------------------------
Missing ???:     105.2 MB

Really unused (free) memory stayed fairly constant. But cache and mapped went up. This really seems like there is some hidden cache that is being slowly depleted but not shown in the stats.

Silvio Massina mentioned it might be the ARC. Here is the output of

cat /proc/spl/kstat/zfs/arcstats

6 1 0x01 91 4368 56409879056 315868863969705
name                            type data
hits                            4    15276585
misses                          4    1100779
demand_data_hits                4    10451405
demand_data_misses              4    57248
demand_metadata_hits            4    3886139
demand_metadata_misses          4    876962
prefetch_data_hits              4    133147
prefetch_data_misses            4    71927
prefetch_metadata_hits          4    805894
prefetch_metadata_misses        4    94642
mru_hits                        4    2334376
mru_ghost_hits                  4    9870
mfu_hits                        4    12003233
mfu_ghost_hits                  4    34745
deleted                         4    89041
mutex_miss                      4    10
evict_skip                      4    239
evict_not_enough                4    2
evict_l2_cached                 4    0
evict_l2_eligible               4    14139960320
evict_l2_ineligible             4    3255242752
evict_l2_skip                   4    0
hash_elements                   4    554684
hash_elements_max               4    568778
hash_collisions                 4    424785
hash_chains                     4    33824
hash_chain_max                  4    5
p                               4    3482926902
c                               4    11779217520
c_min                           4    33554432
c_max                           4    16847226880
size                            4    11717468560
hdr_size                        4    226991968
data_size                       4    8517812736
metadata_size                   4    1503463424
other_size                      4    1469200432
anon_size                       4    11872256
anon_evictable_data             4    0
anon_evictable_metadata         4    0
mru_size                        4    2606045184
mru_evictable_data              4    1947246592
mru_evictable_metadata          4    39131136
mru_ghost_size                  4    7123638784
mru_ghost_evictable_data        4    6026175488
mru_ghost_evictable_metadata    4    1097463296
mfu_size                        4    7403358720
mfu_evictable_data              4    6570566144
mfu_evictable_metadata          4    378443776
mfu_ghost_size                  4    598224896
mfu_ghost_evictable_data        4    589954048
mfu_ghost_evictable_metadata    4    8270848
l2_hits                         4    0
l2_misses                       4    0
l2_feeds                        4    0
l2_rw_clash                     4    0
l2_read_bytes                   4    0
l2_write_bytes                  4    0
l2_writes_sent                  4    0
l2_writes_done                  4    0
l2_writes_error                 4    0
l2_writes_lock_retry            4    0
l2_evict_lock_retry             4    0
l2_evict_reading                4    0
l2_evict_l1cached               4    0
l2_free_on_write                4    0
l2_cdata_free_on_write          4    0
l2_abort_lowmem                 4    0
l2_cksum_bad                    4    0
l2_io_error                     4    0
l2_size                         4    0
l2_asize                        4    0
l2_hdr_size                     4    0
l2_compress_successes           4    0
l2_compress_zeros               4    0
l2_compress_failures            4    0
memory_throttle_count           4    0
duplicate_buffers               4    0
duplicate_buffers_size          4    0
duplicate_reads                 4    0
memory_direct_count             4    445
memory_indirect_count           4    3009
arc_no_grow                     4    0
arc_tempreserve                 4    0
arc_loaned_bytes                4    0
arc_prune                       4    0
arc_meta_used                   4    3199655824
arc_meta_limit                  4    12635420160
arc_meta_max                    4    4183324304
arc_meta_min                    4    16777216
arc_need_free                   4    0
arc_sys_free                    4    526475264

I can't make much of it. If I understand it correctly, "size" should be the size of the ARC. But that would be 11 GB, which does not fit anywhere in my memory stats.
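Since the raw byte counts in arcstats are hard to read, here is a small sketch that pulls a single row out of the file and converts it to MiB. The helper name `arcstat_mib` is hypothetical; it only assumes the three-column "name type data" layout shown in the dump above.

```shell
#!/bin/sh
# arcstat_mib NAME [FILE]: print the value of one arcstats row in MiB.
# FILE defaults to /proc/spl/kstat/zfs/arcstats (name, type, data columns).
# Hypothetical helper, only meaningful for rows that hold a byte count.
arcstat_mib() {
    awk -v name="$1" '
        # Match the row name exactly so "size" does not pick up "hdr_size".
        $1 == name { printf "%.1f\n", $3 / 1048576 }
    ' "${2:-/proc/spl/kstat/zfs/arcstats}"
}
```

For example, `arcstat_mib size` gives the current ARC size and `arcstat_mib c_max` the upper limit it may grow to. With the dump above, the "size" row of 11717468560 bytes works out to roughly 10.9 GiB.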

Update 2:

I just ended Baloo (Search indexer under Kubuntu) and now things are extreme again.

 Memory Stats ──────────────────────────────────────────────────────────────────────────────────────────────────────────│
│                RAM     High      Low     Swap    Page Size=4 KB                                                        │
│ Total MB     32133.5     -0.0     -0.0   8195.5                                                                        │
│ Free  MB      1004.4     -0.0     -0.0   8132.1                                                                        │
│ Free Percent     3.1%   100.0%   100.0%    99.2%                                                                       │
│             MB                  MB                  MB                                                                 │
│                      Cached=  3458.4     Active=  7348.9                                                               │
│ Buffers=   177.5 Swapcached=     3.7  Inactive =  2662.9                                                               │
│ Dirty  =     2.9 Writeback =     0.0  Mapped   =   937.9                                                               │
│ Slab   =  5526.4 Commit_AS = 13320.6 PageTables=    89.2                                                               │
│───────────────────────────────────────────────────────────

Now I am missing 6.5GB!! of ram.

The Baloo process used 3.5 GB before I ended it.

Is KDE using some nasty tricks here?

UPDATE 3

It gets worse. I tested on a different Linux PC, and there it was clear that "Inactive" overlaps with "Cached". So I am missing even more memory than I calculated.


Solution & afterthoughts

As the accepted answer by Silvio Massina pointed out, it was indeed the ZFS ARC.

It now claimed to have allocated 14GB of memory.

cat /proc/spl/kstat/zfs/arcstats | grep -E "^size"

size                            4    14953847480

So I used stress to grab 10 GB of memory:

stress --vm-bytes 10000000000 --vm-keep -m 1

And voilà, the ARC size went down accordingly:

cat /proc/spl/kstat/zfs/arcstats | grep -E "^size"

size                            4    4147553616

And now after killing stress I have 11GB free memory which is slowly eaten up by the ARC again over time. (Which is totally fine)

So the ARC behaves like a cache, but it is shown under used memory instead, and it is not listed with the processes either, only in its own info file. This seems odd to me, as I think the OS should always be the master of memory and should let one know who uses it. I asked a more precise follow-up question about this on AskUbuntu, if you are interested.

PS: Please don't forget to upvote me, if this helped you as well. The bounty was not cheap G


Solution 1:

ZFS uses ARC (Adaptive Replacement Cache).

From the Gentoo wiki:

The manner in which Linux accounts for memory used by ARC differs from memory used by the page cache. Specifically, memory used by ARC is included under "used" rather than "cached" in the output used by the free program. This in no way prevents the memory from being released when the system is low on memory. However, it can give the impression that ARC (and by extension ZFS) will use all of system memory if given the opportunity.

The amount of memory used for the ARC depends on the memory available on the system and can be controlled by setting the zfs_arc_min and zfs_arc_max parameters.

You can set the value of zfs_arc_max at runtime with:

echo 2147483648 > /sys/module/zfs/parameters/zfs_arc_max

Or at boot with:

echo "options zfs zfs_arc_max=2147483648" >> /etc/modprobe.d/zfs.conf

Values are expressed in bytes.
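Because the values are in bytes, it is easy to get a digit wrong when typing them by hand. As a small sketch, the following helper (the name `zfs_arc_max_option` is made up for this example) turns a whole number of GiB into the options line used above:

```shell
#!/bin/sh
# zfs_arc_max_option GIB: print a modprobe options line that caps the ARC
# at GIB gibibytes. Hypothetical helper; it only prints the line and does
# not write to any file or change any running setting.
zfs_arc_max_option() {
    gib=$1
    case $gib in
        # Reject empty input and anything that is not a plain whole number.
        ''|*[!0-9]*)
            echo "usage: zfs_arc_max_option <whole GiB>" >&2
            return 1
            ;;
    esac
    echo "options zfs zfs_arc_max=$((gib * 1024 * 1024 * 1024))"
}
```

`zfs_arc_max_option 2` prints the same 2147483648-byte line as in the example above. The value currently in effect can be read back with `cat /sys/module/zfs/parameters/zfs_arc_max`.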