Linux OOM disk I/O. Also: swap, what is it good for?
I'm having problems with the OOM killer on one of my Linux (2.6.37) installs. The computer has 4GB of memory which I sometimes utilize fully. In those cases, I expect the OOM handler to come in and do its job by killing a process or two. Instead of doing this, or perhaps while attempting to do it, the system locks up, doing disk I/O like there is no tomorrow. Here's the thing: I DON'T have any swap enabled. For some reason my swapless system is still locking up with massive amounts of disk I/O, even though the appropriate course of action is to just kill a process or two. Thoughts?
The whole issue makes me wonder whether Linux requires swap in some way which I am not aware of. An explanation of whether this is the case and why would be greatly appreciated. I am familiar with the ideas of swap on a conceptual level (i.e. virtual memory, paging, overcommit), but I wonder if there is any implementation detail that I may have missed.
The real question is, why are you running with no swap? Especially if you are seeing (serious) performance issues related to running out of RAM? You know not having swap can actually make your system slower, right?
The obvious solution is to add some swap space, and not have your system crap out on you. Considering how cheap disk space is, I can't think of any common situations[1] where you should ever build a system without swap.
To answer your question: I don't remember all of the low-level details on why swap is important even on systems where you aren't going to exhaust memory, but there have been arguments on the Linux Kernel mailing list about whether it's reasonable to run systems without swap (and there haven't been a lot of conclusive answers). The general consensus is to always have swap, and to adjust the swappiness as needed.
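If you do add swap, swappiness is exposed at `/proc/sys/vm/swappiness` (also settable with `sysctl vm.swappiness=N`). Here's a minimal sketch of reading it programmatically; the `read_swappiness` helper is hypothetical, not part of any standard library:

```python
def read_swappiness(path="/proc/sys/vm/swappiness"):
    """Return the kernel's vm.swappiness value (0-100), or None if
    the file is missing (non-Linux box) or unreadable."""
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return None

# Lowering the value (e.g. `sysctl vm.swappiness=10` as root) makes the
# kernel prefer dropping page cache over swapping out anonymous pages.
```

Writing a new value requires root; the default on most distributions is 60.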
Also, I think you're misunderstanding some important caveats regarding the Linux OOM killer. First of all, relying on it to handle your out-of-memory issues is a Very Bad Idea (tm). It can be very indiscriminate about what it kills, and it is entirely possible that you will be left with an unstable or even unusable system. Yes, it attempts to kill recently started processes that are eating lots of memory (a minor safeguard to try to catch a runaway process), but there are no guarantees. I've seen it kill ssh, kill Xen processes (on a Xen virtual host server, causing VMs to crash), and in one case it killed NFS.
As for the I/O... I don't know for sure what would be causing it. Perhaps a filesystem- or disk-related process got killed? Perhaps a process has some sort of "cache to disk" functionality built in for when it can't allocate enough memory?
Another note, if this is a desktop, swap is required for Suspend to Disk. If it's a server, relying on OOM is never a good idea, as it compromises stability for, well, no good reason at all.
[1] Embedded systems are about the only obvious exception, and they aren't particularly common (and if you're dealing with embedded systems, you're already going to be aware of the requirements).
I think AndreasM has hit it on the head (the reason for the disk going all thrashy). Executables are demand paged -- so in normal operation you will have nearly all of your executables and libraries sitting in good ol' physical RAM. But when RAM runs low (though not low enough for the out-of-memory killer to be run), these clean pages are evicted from RAM. At first this is no problem, because pages are evicted least-recently-used first, so the kernel kicks out pages you aren't using anyway. But then it kicks out the ones you are using, only to have to page them right back in from disk moments later. Thrash city.
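You can check for exactly this symptom: every time an evicted executable page has to be re-read from disk, the kernel bumps the `pgmajfault` counter in `/proc/vmstat`. A rapidly rising major-fault rate on a swapless box is the signature of this kind of thrashing. A small hypothetical parser (the helper name is mine, not a kernel interface):

```python
def major_faults(vmstat_text):
    """Parse the cumulative major page-fault count (pgmajfault) out of
    the text of /proc/vmstat. Returns None if the field is absent."""
    for line in vmstat_text.splitlines():
        name, _, value = line.partition(" ")
        if name == "pgmajfault":
            return int(value)
    return None
```

To get a rate, read `/proc/vmstat` twice a few seconds apart and diff the two counts; hundreds of major faults per second under memory pressure points straight at demand-paging thrash.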
Basically, if something had used just a bit more RAM, the OOM killer probably would have kicked in, but you weren't there yet. As a few have said, the OOM killer is indiscriminate; it's really more of a last resort to avoid a kernel panic than something you should rely on in normal operation. If you have some custom setup, I'd consider writing a small daemon to monitor free memory and kill processes using the policy of your choosing when memory approaches full.
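The daemon idea above can be sketched roughly like this. This is a toy outline, not a production tool: the threshold, the polling interval, and the victim-selection policy are all placeholders you'd choose yourself. Note that `MemAvailable` only exists in kernel 3.14+, so on an older kernel like the asker's 2.6.37 this falls back to the cruder `MemFree`:

```python
import time

THRESHOLD_KB = 200 * 1024  # hypothetical floor: act below ~200 MB available

def mem_available_kb(meminfo_text):
    """Extract MemAvailable (in kB) from the text of /proc/meminfo,
    falling back to MemFree on kernels that predate MemAvailable."""
    fields = {}
    for line in meminfo_text.splitlines():
        name, _, rest = line.partition(":")
        if rest:
            fields[name] = int(rest.split()[0])
    return fields.get("MemAvailable", fields.get("MemFree"))

def monitor(interval=5):
    """Toy polling loop. A real daemon would pick a victim by its own
    policy (e.g. largest RSS from /proc/<pid>/status) and send SIGTERM
    before escalating to SIGKILL."""
    while True:
        with open("/proc/meminfo") as f:
            avail = mem_available_kb(f.read())
        if avail is not None and avail < THRESHOLD_KB:
            print("low memory: %d kB available" % avail)
            # policy decision goes here, e.g. os.kill(victim_pid, signal.SIGTERM)
        time.sleep(interval)
```

The point of rolling your own is exactly that you control the policy: you decide what's expendable, instead of letting the OOM killer take out sshd or your NFS server.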