Is it possible to make the OOM killer intervene earlier?
I also struggled with that issue. I just want my system to stay responsive, no matter what, and I prefer losing processes to waiting a few minutes. There seems to be no way to achieve this using the kernel OOM killer.
However, in user space, we can do whatever we want. So I wrote the Early OOM Daemon ( https://github.com/rfjakob/earlyoom ) that will kill the largest process (by RSS) once the available RAM goes below 10%.
Without earlyoom, it has been easy to lock up my machine (8GB RAM) by starting http://www.unrealengine.com/html5/ a few times. Now, the guilty browser tabs get killed before things get out of hand.
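To illustrate the idea (this is not earlyoom's actual code, which is written in C and does considerably more), here is a minimal Python sketch of a single check. It assumes "available RAM" means the MemAvailable field of /proc/meminfo and that RSS is read from VmRSS in /proc/<pid>/status; the function names are made up for this example, and it only reports the process it would kill rather than killing anything.

```python
import os

THRESHOLD_PERCENT = 10.0  # the 10% figure mentioned above


def parse_kb_fields(text):
    """Collect 'Name:  12345 kB' lines into a dict of name -> kB."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and len(parts) == 2 and parts[1] == "kB" and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def available_percent(meminfo):
    """Available RAM as a percentage of total RAM (assumes MemAvailable)."""
    return 100.0 * meminfo["MemAvailable"] / meminfo["MemTotal"]


def largest_process():
    """Return (pid, rss_kb) for the process with the largest VmRSS, or None."""
    best = None
    for entry in os.listdir("/proc"):
        if not entry.isdigit():
            continue
        try:
            with open(f"/proc/{entry}/status") as f:
                rss = parse_kb_fields(f.read()).get("VmRSS")
        except OSError:
            continue  # the process exited while we were scanning
        # Kernel threads have no VmRSS line, so rss is None for them.
        if rss is not None and (best is None or rss > best[1]):
            best = (int(entry), rss)
    return best


def check_once():
    """Return the (pid, rss_kb) a real daemon would kill now, or None."""
    with open("/proc/meminfo") as f:
        meminfo = parse_kb_fields(f.read())
    if available_percent(meminfo) < THRESHOLD_PERCENT:
        return largest_process()
    return None


if __name__ == "__main__":
    victim = check_once()
    if victim is None:
        print("enough memory available, nothing to do")
    else:
        print(f"would kill PID {victim[0]} ({victim[1]} kB RSS)")
```

A real daemon would run this check in a loop several times per second and send SIGTERM or SIGKILL to the chosen PID.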
The default policy of the kernel is to allow applications to keep allocating virtual memory as long as there is free physical memory. The physical memory isn't actually used until the applications touch the virtual memory they allocated, so an application can allocate much more memory than the system has and then start touching it later, causing the kernel to run out of memory and trigger the out-of-memory (OOM) killer. Before the hogging process is killed, though, it has already forced the disk cache to be emptied, which makes the system slow to respond for a while until the cache refills.
You can change the default policy to disallow memory overcommit by writing a value of 2 to /proc/sys/vm/overcommit_memory. The default value of /proc/sys/vm/overcommit_ratio is 50, so the kernel will not allow applications to allocate more than swap plus 50% of RAM. If you have no swap, then the kernel will not allow applications to allocate more than 50% of your RAM, leaving the other 50% free for the cache. That may be a bit excessive, so you may want to increase this value to, say, 85, so applications can allocate up to 85% of your RAM, leaving 15% for the cache.
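As a rough check of what a given ratio allows: with overcommit_memory set to 2, the kernel documentation describes the limit as swap plus overcommit_ratio percent of physical RAM (excluding huge pages). The small Python helper below just does that arithmetic; the function and parameter names are made up for this example, and the result is an estimate of the CommitLimit value the kernel reports in /proc/meminfo, not a replacement for reading it.

```python
def commit_limit_kb(ram_kb, swap_kb, overcommit_ratio=50, hugetlb_kb=0):
    """Approximate commit limit in kB under overcommit_memory=2.

    limit = (RAM - huge pages) * overcommit_ratio / 100 + swap
    """
    return (ram_kb - hugetlb_kb) * overcommit_ratio // 100 + swap_kb


if __name__ == "__main__":
    eight_gib_kb = 8 * 1024 * 1024
    # 8 GiB of RAM, no swap, default ratio of 50
    print(commit_limit_kb(eight_gib_kb, 0))
    # Same machine with the ratio raised to 85
    print(commit_limit_kb(eight_gib_kb, 0, overcommit_ratio=85))
```

On an 8 GiB machine without swap, the default ratio gives a limit of 4194304 kB (4 GiB); adding swap raises the limit by the full size of the swap.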
For me, setting vm.admin_reserve_kbytes=262144 does exactly this: the OOM killer intervenes before the system becomes completely unresponsive.
The other answers have good automatic solutions, but I find it can be helpful to also enable the SysRq key for when things get out of hand. With the SysRq key, you'd be manually messaging the kernel, and you can do things like a safe reboot (with SysRq + REISUB) even if userspace has completely frozen.
To allow the kernel to listen to these requests, set kernel.sysrq = 1, or enable just the functions you're likely to use with a bitmask (documented in the kernel's SysRq documentation). For example, kernel.sysrq = 244 will enable all the combos needed for the safe reboot above as well as manual invocation of the OOM killer with SysRq + F.
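To see which functions a given bitmask enables, the value can be decoded bit by bit. The Python sketch below uses the bit meanings listed in the kernel's SysRq documentation; the table and function names are made up for this example.

```python
# Bit values for kernel.sysrq, as listed in the kernel's SysRq documentation.
SYSRQ_BITS = {
    2: "control of console logging level",
    4: "control of keyboard (SAK, unraw)",
    8: "debugging dumps of processes etc.",
    16: "sync command",
    32: "remount read-only",
    64: "signalling of processes (term, kill, oom-kill)",
    128: "reboot/poweroff",
    256: "nicing of all RT tasks",
}


def decode_sysrq(mask):
    """Return the descriptions of the function groups enabled by a bitmask."""
    return [name for bit, name in sorted(SYSRQ_BITS.items()) if mask & bit]


if __name__ == "__main__":
    for name in decode_sysrq(244):
        print(name)
```

For 244 (4 + 16 + 32 + 64 + 128) this lists keyboard control, sync, remount read-only, process signalling and reboot. Process signalling is the group that covers both the E and I steps of REISUB and the manual OOM kill with F.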