OOM killer not working?
From what I understand, when the system is close to running out of free memory, the kernel should start killing processes to reclaim some. But on my system this does not happen at all.
Suppose a simple script that just allocates much more memory than is available in the system (an array with millions of strings, for example). If I run a script like this (as a normal user), it takes all the memory until the system completely freezes (only SysRq REISUB works).
The weird part is that when the computer freezes, the hard drive LED turns on and stays on until the computer is rebooted, whether or not I have a swap partition mounted!
So my questions are:
- Is this behavior normal? It's odd that an application executed as a normal user can just crash the system this way...
- Is there any way to make Ubuntu automatically kill those applications when they take too much (or most) of the memory?
Additional information
- Ubuntu 12.04.3
- Kernel 3.5.0-44
- RAM: ~3.7GB of 4GB (shared with graphics card).
$ tail -n+1 /proc/sys/vm/overcommit_*
==> /proc/sys/vm/overcommit_memory <==
0

==> /proc/sys/vm/overcommit_ratio <==
50

$ cat /proc/swaps
Filename    Type       Size     Used    Priority
/dev/dm-1   partition  4194300  344696  -1
From the official /proc/sys/vm/* documentation:
oom_kill_allocating_task
This enables or disables killing the OOM-triggering task in out-of-memory situations.
If this is set to zero, the OOM killer will scan through the entire tasklist and select a task based on heuristics to kill. This normally selects a rogue memory-hogging task that frees up a large amount of memory when killed.
If this is set to non-zero, the OOM killer simply kills the task that triggered the out-of-memory condition. This avoids the expensive tasklist scan.
If panic_on_oom is selected, it takes precedence over whatever value is used in oom_kill_allocating_task.
The default value is 0.
To summarize: when oom_kill_allocating_task is set to 1, the kernel does not scan the system for a process to kill, which is an expensive and slow task. It just kills the process that caused the system to run out of memory.
In my own experience, when an OOM condition is triggered, the kernel has no "strength" left to do such a scan, which makes the system totally unusable.
Also, killing the task that caused the problem seems like the more obvious choice, so I fail to understand why it is set to 0 by default.
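As a rough illustration of the precedence described in the quoted documentation (panic_on_oom first, then oom_kill_allocating_task, otherwise the tasklist scan), here is a minimal shell sketch. The function names describe_oom_policy and read_vm_setting are made up for this example and are not part of any existing tool. The fallback to 0 when a file cannot be read is also an assumption, chosen because 0 is the documented default.

```shell
#!/bin/sh
# Hypothetical helper: print which OOM behavior applies for the given values.
# $1 = value of vm.panic_on_oom, $2 = value of vm.oom_kill_allocating_task
describe_oom_policy() {
    panic_on_oom=$1
    kill_allocating=$2
    if [ "$panic_on_oom" -ne 0 ]; then
        # panic_on_oom takes precedence over oom_kill_allocating_task
        echo "panic"
    elif [ "$kill_allocating" -ne 0 ]; then
        # non-zero: kill the task that triggered the OOM condition
        echo "kill-allocating-task"
    else
        # zero: scan the tasklist and pick a victim by heuristics
        echo "scan-tasklist"
    fi
}

# Hypothetical helper: read a setting from /proc/sys/vm.
# Falls back to 0 (an assumption for this sketch) if the file is unreadable.
read_vm_setting() {
    cat "/proc/sys/vm/$1" 2>/dev/null || echo 0
}

# Report the policy for the settings currently in effect on this machine.
describe_oom_policy "$(read_vm_setting panic_on_oom)" \
                    "$(read_vm_setting oom_kill_allocating_task)"
```

Reading these files needs no root privileges, so this can be run as a normal user before and after changing the setting.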
For testing, you can just write to the proper pseudo-file in /proc/sys/vm/, which will be undone on the next reboot:
echo 1 | sudo tee /proc/sys/vm/oom_kill_allocating_task
For a permanent fix, write the following to /etc/sysctl.conf or to a new file under /etc/sysctl.d/ with a .conf extension (/etc/sysctl.d/local.conf, for example):
vm.oom_kill_allocating_task = 1
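Before rebooting, it can help to check that the file really contains the setting. The following is only a sketch: sysctl_conf_value is a hypothetical helper written for this example, and the path /etc/sysctl.d/local.conf is just the example name used above. It handles simple "key = value" lines and skips comment lines; it is not a full reimplementation of how sysctl parses its files.

```shell
#!/bin/sh
# Hypothetical helper: print the value assigned to a key in a sysctl-style
# config file. If the key appears more than once, the last assignment wins.
# $1 = key (e.g. vm.oom_kill_allocating_task), $2 = path to the config file
sysctl_conf_value() {
    awk -F= -v key="$1" '
        /^[ \t]*[#;]/ { next }   # skip comment lines
        {
            k = $1; v = $2
            gsub(/^[ \t]+|[ \t]+$/, "", k)   # trim whitespace around the key
            gsub(/^[ \t]+|[ \t]+$/, "", v)   # trim whitespace around the value
            if (k == key) last = v
        }
        END { if (last != "") print last }
    ' "$2"
}

# Example: inspect the file suggested above, if it exists.
conf=/etc/sysctl.d/local.conf
if [ -r "$conf" ]; then
    sysctl_conf_value vm.oom_kill_allocating_task "$conf"
fi

# To load the file without rebooting and confirm the live value (needs root):
#   sudo sysctl -p /etc/sysctl.d/local.conf
#   sysctl -n vm.oom_kill_allocating_task
```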
Update: The bug is fixed.
Teresa's answer is a good one and is enough to work around the problem.
Additionally, I've filed a bug report, because this is definitely broken behavior.