I have an Apache web server running many VirtualHosts.

Recently it has been bogging down and becoming unresponsive, and I'm wondering how I can determine which VirtualHosts are causing most of the issue. We have had occasions in the past where a bug in the code of an individual site has taken down the whole server. My goal is to be able to diagnose these instances quickly.

I am monitoring the server with Munin and notice that the number of Apache processes, memory usage, and load tend to be very high during the periods in question. The problem is that these statistics are for the whole web server, not for individual VirtualHosts.

I have written a script to parse the web logs for traffic per VirtualHost, but that does not appear to be enough. I probably need to determine how many Apache processes each VirtualHost is responsible for, how long it holds each process open, or perhaps how much memory it is responsible for.

Where can I find this information? I don't mind writing a script to track this data, but I don't know exactly where to extract it from in the first place.


Solution 1:

I appreciate that it doesn't always suit to have mod_status available and on all of the time, but it and apachetop are the best ways to diagnose these problems. However, there are many ways to skin a cat.

This trick is useful in a number of circumstances and isn't Apache-specific. It does depend on a number of factors, however, and you need to know what it's doing to know its limitations.

for pid in `pgrep -u www-data`; do find /proc/${pid}/cwd -printf "%l\n" ; done

Let's break it down:

  • pgrep -u www-data gives you the list of PIDs running under the user www-data. That's the default on Debian / Ubuntu; change it to suit your own system (RedHat-based systems tend to use httpd as the user, for example). For systems without pgrep, you can use ps axuwww | grep user | awk '{print $2}', replacing user with the account name.
  • The for ...; do ...; done loop runs the command(s) in the do part once for every PID in that list.
  • find /proc/${pid}/cwd -printf "%l\n" looks up each of those PIDs under /proc and prints the current working directory of that process. Apache will chdir() to the VirtualHost's directory by default when serving files from that VirtualHost. /proc/PID/cwd is a symbolic link to the directory that Apache process is running in, and -printf "%l\n" prints the target of that link. See find(1) for more info on that.
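
To turn that list of directories into a per-VirtualHost count, the output can be tallied. The following is a minimal sketch rather than a tested tool: the function names list_cwds and tally are made up for illustration, www-data is only the Debian / Ubuntu default mentioned above, and it assumes GNU find and a Linux /proc. You will generally need to run it as root or as the Apache user to read the cwd links.

```shell
#!/bin/sh
# Hypothetical helper: print the working directory of every process owned
# by the given user (defaults to www-data, an assumption; adjust as needed).
list_cwds() {
    user="${1:-www-data}"
    for pid in $(pgrep -u "$user"); do
        # Same command as in the answer. Errors for processes that have
        # already exited still go to stderr, so they remain visible.
        find "/proc/${pid}/cwd" -printf "%l\n"
    done
}

# Hypothetical helper: count identical lines read from stdin and print
# them with the most frequent directory first.
tally() {
    sort | uniq -c | sort -rn
}
```

Running list_cwds www-data | tally a few times while the server is struggling should show which directories hold the most Apache processes at that moment. It is only a snapshot, so the caveats about chdir() and short-lived processes apply here as well.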

There are two major caveats to that trick:

1) If something running in the same context as the Apache process calls chdir() to a directory outside the VirtualHost directory, you'd be hard pushed to find that out.

e.g. a PHP script running under mod_php (a CGI is different because Apache forks a separate process, but I'm presuming CGIs aren't the problem, or you'd be able to track them more easily).

2) If you have Apache processes that serve pages very quickly (e.g. a small static HTML page), they may be gone before you can inspect them. This normally isn't a problem, but it is possible. If you're getting a lot of "No such file or directory" errors, this is what you are seeing: the Apache processes listed by pgrep (or ps) have already exited by the time you check /proc. I would expect some such errors, but not a majority unless the site fits this particular case. Obviously this means those processes are serving pages very quickly.

Regarding memory-bound Apache processes, I use ps_mem.py to calculate memory usage on my web servers. If your Apache processes are large (in terms of resident memory size) and they are exiting quickly, that is roughly the equivalent of asking a big fat guy to keep running 100m sprints. If your web server isn't a shared one, those "No such file or directory" errors are normally a good sign that you should move some content onto a smaller, lightweight web server (e.g. nginx / lighttpd) or start heavily caching content (e.g. varnish / squid).
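
For a rough per-directory view of memory, the same /proc walk can be paired with each process's resident set size. This is a sketch under assumptions, not a replacement for ps_mem.py: the function names are hypothetical, it uses readlink instead of the find command above, it reads the VmRSS line from /proc/PID/status on Linux, and www-data is again only the Debian / Ubuntu default. RSS counts shared pages once per process, so the totals overstate real usage; treat them as a ranking rather than an exact figure.

```shell
#!/bin/sh
# Hypothetical helper: print "RSS_kB directory" for each process owned by
# the given user (defaults to www-data, an assumption).
list_rss_and_cwd() {
    user="${1:-www-data}"
    for pid in $(pgrep -u "$user"); do
        # VmRSS is reported in kB; the process may exit between the two reads.
        rss=$(awk '/^VmRSS:/ { print $2 }' "/proc/${pid}/status" 2>/dev/null)
        dir=$(readlink "/proc/${pid}/cwd" 2>/dev/null)
        if [ -n "$rss" ] && [ -n "$dir" ]; then
            echo "$rss $dir"
        fi
    done
}

# Hypothetical helper: read "kB directory" lines from stdin, add up the kB
# per directory, and print the largest total first.
sum_rss_by_dir() {
    awk '{ kb = $1; sub(/^[0-9]+[ \t]+/, ""); total[$0] += kb }
         END { for (d in total) printf "%d %s\n", total[d], d }' | sort -rn
}
```

A call such as list_rss_and_cwd www-data | sum_rss_by_dir would then list directories by summed resident memory, which can be compared against what ps_mem.py reports for Apache as a whole.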

Solution 2:

I think you want apachetop, or else mod_status (with ExtendedStatus On). I have yet to see a performance problem in Apache that wasn't lit up by mod_status, and apachetop looks like a neat tool (though it has some annoying limitations regarding log layout).