How to stop Apache from crashing my entire server?

I maintain a Gentoo server running a few services, including Apache. It's fairly low-end hardware (2 GB of RAM and a low-end CPU with 2 cores). My problem is that, despite my best efforts, an overloaded Apache crashes the entire server. In fact, at this point I'm close to being convinced that Linux is a horrible operating system that isn't worth anyone's time if you're looking for stability under load.

Things I tried:

  1. Adjusting oom_adj for the root Apache process (and thus all its children). This had close to no effect: when Apache was overloaded, it would still grind the system to a halt, because everything else got paged out before the kernel got around to killing anything.
  2. Turning off swap. This didn't help; the kernel simply evicted the pages backing process binaries and other files on / instead, with the same effect.
  3. Putting it in a memory-limited cgroup (capped at 512 MB of RAM, a quarter of the total). This "worked", at least in my own stress tests - but the server still crashes under real load (all other processes stall, the box becomes inaccessible via SSH, etc.).
  4. Running it with idle I/O priority. This wasn't a very good idea in the end: it just caused the system load to climb indefinitely (into the thousands) with almost no visible effect - until something tried to access an uncached part of the disk, at which point that task would freeze. (So much for good I/O scheduling, eh?)
  5. Limiting the number of concurrent connections to Apache. Setting the limit too low made the web sites unresponsive, because most slots ended up occupied by long-running requests (file downloads).
  6. I tried various Apache MPMs without much success (prefork, event, itk).
  7. Switching from prefork/event+php-cgi+suphp to itk+mod_php. This improved performance, but didn't solve the actual problem.
  8. Switching I/O schedulers (cfq to deadline).
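For reference, the cgroup approach from item 3 can be sketched roughly like this. This is a cgroup-v1 sketch; the mount point (/sys/fs/cgroup/memory) and the PID file path are assumptions that vary by distro, so adjust both:

```shell
#!/bin/sh
# Sketch: cap Apache at 512 MB via the cgroup-v1 memory controller.
# Paths (cgroup mount point, httpd PID file) are assumptions for this example.
CG=/sys/fs/cgroup/memory/apache
LIMIT=$((512 * 1024 * 1024))   # 512 MB in bytes

# These steps need root and a mounted memory cgroup hierarchy:
if [ "$(id -u)" -eq 0 ] && [ -d /sys/fs/cgroup/memory ]; then
    mkdir -p "$CG"
    echo "$LIMIT" > "$CG/memory.limit_in_bytes"
    # Move the Apache master process in; forked children inherit the cgroup.
    if [ -f /var/run/apache2.pid ]; then
        cat /var/run/apache2.pid > "$CG/tasks"
    fi
fi
echo "limit_in_bytes=$LIMIT"
```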

Just to stress this: I don't care if Apache itself goes down under load; I just want the rest of my system to remain stable. Of course, having Apache recover quickly after a brief period of intensive load would be great, but one thing at a time.

Right now I am mostly dumbfounded that humanity, in this day and age, can design an operating system in which such a seemingly simple goal ("don't let one system component take down the entire system") seems practically impossible - or at least, very hard to achieve.

Please don't suggest things like VMs or "BUY MORE RAM".


Some more information gathered with a friend's help: The processes hang when the cgroup oom killer is invoked. Here's the call trace:

[<ffffffff8104b94b>] ? prepare_to_wait+0x70/0x7b
[<ffffffff810a9c73>] mem_cgroup_handle_oom+0xdf/0x180
[<ffffffff810a9559>] ? memcg_oom_wake_function+0x0/0x6d
[<ffffffff810aa041>] __mem_cgroup_try_charge+0x32d/0x478
[<ffffffff810aac67>] mem_cgroup_charge_common+0x48/0x73
[<ffffffff81081c98>] ? __lru_cache_add+0x60/0x62
[<ffffffff810aadc3>] mem_cgroup_newpage_charge+0x3b/0x4a
[<ffffffff8108ec38>] handle_mm_fault+0x305/0x8cf
[<ffffffff813c6276>] ? schedule+0x6ae/0x6fb
[<ffffffff8101f568>] do_page_fault+0x214/0x22b
[<ffffffff813c7e1f>] page_fault+0x1f/0x30

At this point, the Apache memory cgroup is practically deadlocked and burning CPU in syscalls (all with the above call trace). This looks like a problem in the cgroup implementation...
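For what it's worth, when the cgroup wedges like this you can at least inspect its state from the outside. A small diagnostic sketch, assuming the cgroup-v1 path used earlier (the path is an assumption; use whatever you created, and note memory.oom_control only exists on newer kernels):

```shell
#!/bin/sh
# Diagnostic sketch: dump the Apache memory cgroup's counters when it hangs.
# The cgroup path is an assumption for this example.
CG=/sys/fs/cgroup/memory/apache
for f in memory.usage_in_bytes memory.limit_in_bytes memory.failcnt memory.oom_control; do
    if [ -r "$CG/$f" ]; then
        printf '%s: ' "$f"
        cat "$CG/$f"
    else
        echo "$f: not available"
    fi
done
```

A climbing memory.failcnt with usage pinned at the limit confirms the group is thrashing against its cap rather than getting its tasks killed.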


I hate to say it, but you appear to be asking the wrong question.

It's not about stopping Apache from bringing down your server; it's about making your web server serve more queries per second - enough that you don't have a problem in the first place. Part of the answer to that reframed question is then limiting Apache so that it does not crash at high load.

For the second part of that, Apache has limits you can set, MaxClients being an important one: it caps how many children Apache is allowed to run. Every long-running request you can take off Apache (large file downloads, for example) frees up a slot to serve PHP. If the downloads have to be authorised by the PHP layer, it can still do that and then hand the actual transfer back to a web server optimised for static content, such as nginx via its X-Accel-Redirect (sendfile) mechanism.
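As a concrete sketch, a prefork configuration along these lines keeps the child count bounded. All numbers here are illustrative assumptions for a 2 GB box, not recommendations - derive yours from your measured per-child memory:

```apache
# Sketch for the prefork MPM (Apache 2.2 directive names).
# All numbers below are placeholder assumptions; size them from real data.
<IfModule mpm_prefork_module>
    StartServers          4
    MinSpareServers       4
    MaxSpareServers       8
    # Cap concurrent children so the total resident size stays within budget
    MaxClients            40
    # Recycle children periodically to bound memory-leak growth
    MaxRequestsPerChild   500
</IfModule>
```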

Meanwhile, forking on every single request to run PHP the slowest way possible - as a CGI (whichever Apache MPM you may be using) - also has the machine spending large amounts of time not running your code. mod_php is significantly better optimised.

PHP can handle huge amounts of traffic when Apache and the PHP layer are appropriately tuned. Yesterday, 11th Dec 2010, for example, the pair of PHP servers that I run did almost 19 million hits in the 24-hour period, most of that between 7am and 8pm.

There are plenty of other questions here, and articles elsewhere, about optimising Apache and PHP. I think you need to read them before blaming Linux, Apache and PHP.


When you are dealing with a production Apache server, you MUST know your average process size, especially with PHP. I recommend you:

  • Check your processes' average memory consumption
  • Set MaxClients to RAM_DEDICATED_TO_APACHE / AVERAGE_PROCESS_MEMORY

where RAM_DEDICATED_TO_APACHE is another estimate: TOTAL_RAM minus the RAM that the rest of the machine needs (and be generous with the database if you are running one on the same machine).
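As a worked example of that arithmetic (all the numbers below are assumptions for a 2 GB machine; measure your own child size with ps):

```shell
#!/bin/sh
# Hypothetical numbers for a 2 GB box; replace them with your own measurements.
total_ram_mb=2048
reserved_mb=768      # OS, database, and everything that is not Apache
avg_child_mb=32      # average Apache child RSS in MB, e.g. measured with:
                     #   ps -C apache2 -o rss= | awk '{ s += $1; n++ } END { print s / n / 1024 }'
max_clients=$(( (total_ram_mb - reserved_mb) / avg_child_mb ))
echo "MaxClients $max_clients"
```

With these assumed figures that yields MaxClients 40; the point is that the cap comes from measured memory, not guesswork.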

I really recommend using Varnish. You can easily run two servers on different ports on the same machine, and route the static files to a specialised file (media) server (lighttpd, nginx) or an Apache instance with the worker MPM and no extra modules. And, of course, cache the static content with Varnish.

Splitting the load is important because, if you don't, you will be tying up a full Apache child's worth of RAM just to deliver each static file, which needs less than 1 MB to serve.

If you really need to make sure you never consume all the RAM, you can install a cron job that runs every 2 minutes (more or less, as you see fit) with the following line, adjusting the 50 to whatever minimum amount of free RAM you want - but keep it above 30 at least, because you'll need some RAM left to stop the server. (In vmstat's last line, $4 is free memory and $6 is cache, both in MB here.)

vmstat -S M | tail -n 1 | awk 'BEGIN { "date" | getline d } { if ($4 + $6 < 50) { system("/etc/init.d/httpd stop"); system("/etc/init.d/httpd start"); print "Restarting Apache on " d >> "/var/log/apache-reboots.log" } }'

This is a very hackish (dirty) way of limiting your RAM usage, but it can be very helpful when you are not really sure about your average memory per Apache process. If you see several restarts in your log file (/var/log/apache-reboots.log), you should tune your Apache MaxClients, MaxRequestsPerChild and ThreadsPerChild to avoid future hard restarts. With time and tuning, you will arrive at the exact configuration for your server.