Apache crashes with memory/CPU overload when Googlebot visits the site
I have a low-traffic site with fewer than 500 hits a day. The server has 6 GB of memory and is heavily underutilized; on average about 5% is in use. But as soon as Googlebot establishes a connection to my web server (Apache), memory and CPU usage spike within seconds and the server becomes inaccessible: the website, SSH, and all other services.
When I run lsof on port 80, here is what I see in the seconds before the site crashes:
lsof -i:80
mywebsite:http->crawl-66-249-71-200.googlebot.com:43567 (ESTABLISHED)
Googlebot is set to a slow crawl rate.
Apache config is:
ServerLimit 256
MaxClients 150
MaxRequestsPerChild 100
KeepAlive Off
KeepAliveTimeout 5
MaxKeepAliveRequests 100
The error log shows:
Cannot allocate memory: couldn't create child process: /opt/suphp/sbin/suphp
Solution 1:
My employer actively blocks Googlebot and other crawlers on our servers when the load jumps. I don't agree with the practice; in my opinion, having to block a crawler is a sign that something is seriously wrong with the server. Then again, we host thousands of different websites, whereas you seem to have your own server.
That leads me to believe, as Rilindo guessed, that something is wrong with your configuration. At least one item in the configuration you posted sticks out like a sore thumb:
MaxRequestsPerChild 100
Are you aware that this makes every child process exit after serving only 100 requests, so Apache is constantly killing children and forking new ones? The default is 10000 in most cases. I would start by setting it to 10000 and see where that gets you.
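As a rough sketch of that change, the settings from the question might look like this with only MaxRequestsPerChild raised. The `<IfModule>` wrapper assumes Apache 2.2 with the prefork MPM, which the question does not confirm; the other values are copied from the question unchanged and are not tuned recommendations.

```apache
# Assumption: Apache 2.2 with the prefork MPM; adjust the IfModule name for other setups.
<IfModule mpm_prefork_module>
    ServerLimit          256
    MaxClients           150
    # Recycle a child only after 10000 requests instead of 100,
    # so Apache is not constantly forking replacement children.
    MaxRequestsPerChild  10000
</IfModule>
```

Running `apachectl configtest` after editing checks the syntax before you restart Apache.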
I also see that you're using suPHP. Unless you have many different users on the system and isolation between them is a security concern, I recommend using mod_php instead. mod_php is an Apache module that lets Apache process PHP itself, rather than spawning a separate PHP executable for every request. The interpreter stays loaded inside the Apache child processes, which avoids the per-request process startup cost and reduces overall load. Note that mod_php is normally run with the prefork MPM, because many PHP extensions are not thread-safe under threaded MPMs such as worker or event.
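For illustration, enabling mod_php usually comes down to loading the module and mapping .php files to it. This is a minimal sketch that assumes PHP 5 built as an Apache module; the module name and the `modules/libphp5.so` path are assumptions that vary by distribution and PHP version, and the existing suPHP handler would need to be disabled first.

```apache
# Assumption: PHP 5 built as an Apache module; names and paths vary by distribution.
LoadModule php5_module modules/libphp5.so

# Hand .php files to the in-process PHP interpreter.
<FilesMatch "\.php$">
    SetHandler application/x-httpd-php
</FilesMatch>
```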
If using mod_php isn't an option due to security concerns, then I recommend switching to mod_fcgid; it's pretty much a drop-in replacement for suphp, but much faster.
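A minimal mod_fcgid setup might look like the sketch below. It assumes mod_fcgid 2.3.x directive names (older releases used different names), a php-cgi binary at `/usr/bin/php-cgi`, and a hypothetical document root of `/var/www/example`; the limits are placeholder values, not recommendations.

```apache
# Assumptions: mod_fcgid 2.3.x, php-cgi at /usr/bin/php-cgi,
# and a hypothetical document root of /var/www/example.
LoadModule fcgid_module modules/mod_fcgid.so

# Cap the total number of FastCGI processes so a crawl cannot exhaust memory.
FcgidMaxProcesses 20
# Recycle each PHP process after a fixed number of requests.
FcgidMaxRequestsPerProcess 1000
# Keep PHP's own request limit in line with the value above.
FcgidInitialEnv PHP_FCGI_MAX_REQUESTS 1000

<Directory "/var/www/example">
    Options +ExecCGI
    AddHandler fcgid-script .php
    FcgidWrapper /usr/bin/php-cgi .php
</Directory>
```

Unlike suPHP, the PHP processes here are reused across requests, and the process cap bounds memory use. Running scripts as individual users would additionally involve suEXEC, which this sketch leaves out.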