How to test oom-killer from command line

The OOM Killer, or Out Of Memory Killer, is a process that the Linux kernel employs when the system is critically low on memory. ... This maximises the use of system memory by ensuring that the memory that is allocated to processes is being actively used.

This self-answered question asks:

  • How to test oom-killer from the command line?

A quicker method than the 1/2 hour it takes in the self-answer would be accepted.


Solution 1:

The key to triggering the OOM killer quickly is to avoid getting bogged down by disk accesses. So:

  1. Avoid swapping, unless your goal is specifically to test the behavior of OOM when swap is used. You can disable swap before the test, then re-enable it afterwards. swapon -s tells you what swaps are currently enabled. sudo swapoff -a disables all swaps; sudo swapon -a is usually sufficient to re-enable them.

  2. Avoid interspersing memory accesses with non-swap disk accesses. That globbing-based method eventually uses up your available memory (given enough entries in your filesystem), but the reason it needs so much memory is to store information that it obtains by accessing your filesystem. Even with an SSD, it's likely that much of the time is spent reading from disk, even if swap is turned off. If your goal is specifically to test OOM behavior for memory accesses that are interspersed with disk accesses, that method is reasonable, perhaps even ideal. Otherwise, you can achieve your goal much faster.

Once you've disabled swap, any method that seldom reads from a physical disk should be quite fast. This includes the tail /dev/zero method (found by falstaff, mentioned in a comment above by Doug Smythies). Although it reads from the character device /dev/zero, that "device" just generates null bytes (i.e., bytes of all zeros) and doesn't involve any physical disk access once the device node has been opened. That method works because tail looks for trailing lines in its input, but a stream of zeros contains no newline character, so tail never gets any complete lines to discard and must buffer ever more input.

If you're looking for a one-liner in an interpreted language that allocates and populates the memory algorithmically, you're in luck. In just about any general-purpose interpreted language, it's easy to allocate lots of memory and write to it without otherwise using it. Here's a Perl one-liner that seems to be about as fast as tail /dev/zero (though I haven't benchmarked it extensively):

perl -wE 'my @xs; for (1..2**20) { push @xs, q{a} x 2**20 }; say scalar @xs;'

With swap turned off on an old machine with 4 GiB of RAM, both that and tail /dev/zero took about ten seconds each time I ran them. Both should still work fine on newer machines with much more RAM than that. You can make that perl command much shorter, if your goal is brevity.

That Perl one-liner repeatedly generates moderately long strings (q{a} x 2**20, about a million characters each) and keeps them all around by storing them in an array (@xs). You can adjust the numbers for testing. If the one-liner doesn't exhaust available memory, it outputs the total number of strings created. Assuming the OOM killer does kill perl--with the exact command shown above and no resource quotas to get in the way, I believe in practice it always will--then your shell should show you Killed. Then, as in any OOM situation, dmesg has the details.

Although I like that method, a C program--like the one in Doug Smythies's answer--does illustrate something useful. Allocating memory and accessing the memory don't feel like separate things in high-level interpreted languages, but in C you can notice and, if you choose, investigate those details.


Finally, you should always check that the OOM killer is actually what killed your program. One way to check is to inspect dmesg. Contrary to popular belief, it is actually possible for an attempt to allocate memory to fail fast, even on Linux. It's easy to make this happen with huge allocations that will obviously fail... but even those can happen unexpectedly. And seemingly reasonable allocations may fail fast. For example, on my test machine, perl -wE 'say length q{a} x 3_100_000_000;' succeeds, and perl -wE 'say length q{a} x 3_200_000_000;' prints:

Out of memory!
panic: fold_constants JMPENV_PUSH returned 2 at -e line 1.

Neither triggered the OOM killer. Speaking more generally:

  • If your program precomputes how much memory is needed and asks for it in a single allocation, the allocation may succeed (and if it does, the OOM killer may or may not kill the program when enough of the memory is used), or the allocation may simply fail.
  • Expanding an array to enormous length by adding many, many elements to it often triggers the OOM killer in actual practice, but making it do that reliably in testing is surprisingly tricky. The way dynamic arrays are almost always grown--because it is the most efficient way--is to give each new buffer a capacity x times the capacity of the old buffer. Common values for x include 1.5 and 2 (and the technique is often called "table doubling"). This sometimes bridges the gap between how much memory can actually be allocated and used and how much the kernel knows is too much to even bother pretending to hand out.
  • Memory allocations can fail for reasons that have little to do with the kernel or how much memory is actually available, and that doesn't trigger the OOM killer either. In particular, a program may fail fast on an allocation of any size after successfully performing a very large number of tiny allocations. This failure happens in the bookkeeping that is carried out by the program itself--usually through a library facility like malloc(). I suspect this is what happened to me today when, during testing with bash arrays (which are actually implemented as doubly linked lists), bash quit with an error message saying an allocation of 9 bytes failed.

The OOM killer is much easier to trigger accidentally than to trigger intentionally.

In attempting to deliberately trigger the OOM killer, one way around these problems is to start by requesting too much memory, and go gradually smaller, as Doug Smythies's C program does. Another way is to allocate a whole bunch of moderately sized chunks of memory, which is what the Perl one-liner shown above does: none of the millionish-character strings (plus a bit of additional memory usage behind the scenes) is particularly taxing, but taken together, all the one-megabyte purchases add up.

Solution 2:

This answer uses a C program to allocate as much memory as possible, then gradually actually uses it, resulting in "Killed" from the OOM killer.

/*****************************************************************************
*
* bla.c 2019.11.11 Smythies
*       attempt to invoke OOM by asking for a ridiculous amount of memory
*       see: https://askubuntu.com/questions/1188024/how-to-test-oom-killer-from-command-line
*       still do it slowly, in chunks, so it can be monitored.
*       However simplify the original testm.c, for this example.
*
* testm.cpp 2013.01.06 Smythies
*           added a couple more sleeps, in attempts to observe stuff on linux.
*
* testm.cpp 2010.12.14 Smythies
*           attempt to compile on Ubuntu Linux.
*
* testm.cpp 2009:03:18 Smythies
*           This is not the first edit, but I am just adding the history
*           header.
*           How much memory can this one program ask for and successfully get?
*           Done in two calls, to more accurately simulate the program I
*           am wondering about.
*           This edit is a simple change to print the total.
*           the sleep calls have changed (again) for MS C version 2008.
*           Now they are more like they used to be (getting annoying).
*                                                                     Smythies
*****************************************************************************/

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>  /* for sleep() */

#define CR 13

int main(){
   char *fptr;
   long i, k;

   i = 50000000000L;

   do{
      if(( fptr = (char *)malloc(i)) == NULL){
         i = i - 1000;
      }
   }
   while (( fptr == NULL) && (i > 0));

   sleep(15);  /* for time to observe */
   for(k = 0; k < i; k++){   /* so that the memory really gets allocated and not just reserved */
      fptr[k] = (char) (k & 255);
   } /* endfor */
   sleep(60);  /* O.K. now you have 1 minute */
   free(fptr); /* clean up, if we get here */
   return(0);
}

The result:

doug@s15:~/c$ ./bla
Killed
doug@s15:~/c$ journalctl -xe | grep oom
Nov 11 16:08:24 s15 kernel: mysqld invoked oom-killer: gfp_mask=0x100cca(GFP_HIGHUSER_MOVABLE), order=0, oom_score_adj=0
Nov 11 16:08:25 s15 kernel:  oom_kill_process+0xeb/0x140
Nov 11 16:08:27 s15 kernel: [  pid  ]   uid  tgid total_vm      rss pgtables_bytes swapents oom_score_adj name
Nov 11 16:08:27 s15 kernel: oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0,global_oom,task_memcg=/user/doug/0,task=bla,pid=24349,uid=1000
Nov 11 16:08:27 s15 kernel: Out of memory: Killed process 24349 (bla) total-vm:32638768kB, anon-rss:15430324kB, file-rss:952kB, shmem-rss:0kB, UID:1000 pgtables:61218816kB oom_score_adj:0
Nov 11 16:08:27 s15 kernel: oom_reaper: reaped process 24349 (bla), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB

It still takes a while to run, but only on the order of minutes.
The use of mlock in the C program might help, but I didn't try it.

My test computer is a server, so I use watch -d free -m to monitor progress.

Readers: Messing with OOM is somewhat dangerous. If you read all these answers and comments, you will note some collateral damage and inconsistencies. We cannot control when other tasks might ask for a bit more memory, which could well be at just the wrong time. Proceed with caution; a reboot of the computer is recommended after these types of tests.