How can I determine the cause of an apparent memory leak in my Apache/PHP based web app?
Solution 1:
Tracking down WHAT is causing the problem can be a pain in the ass. The first thing I'd do if I had a problem like that is reduce MaxRequestsPerChild to an aggressively low number (~100-200) and see if that makes a difference. Each child then gets recycled after that many requests, which frees whatever memory it has accumulated. If it does make a difference, then you probably have code that is leaking memory in a loop somewhere and you'll want to run a code audit.
Another thing to look at is Apache's fullstatus output (apachectl fullstatus, which requires mod_status); see if you can find out which particular request is causing the memory leak. Get the PIDs of your suspected processes and run strace on them.
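To pick which PIDs to strace, it can help to rank the Apache children by resident memory. Here is a rough sketch in Python, assuming a Linux box with procps `ps` and worker processes named apache2 or httpd (adjust the names for your distro). The function names are mine, not anything Apache ships.

```python
import subprocess


def parse_ps(output, name_prefixes=("apache2", "httpd")):
    """Parse `ps -eo pid,rss,comm` output into (pid, rss_kib) pairs, largest RSS first."""
    rows = []
    for line in output.splitlines():
        parts = line.split(None, 2)
        if len(parts) != 3:
            continue
        pid, rss, comm = parts
        # The header line ("PID RSS COMMAND") fails this check and is skipped.
        if not (pid.isdigit() and rss.isdigit()):
            continue
        if comm.strip().startswith(name_prefixes):
            rows.append((int(pid), int(rss)))
    rows.sort(key=lambda row: row[1], reverse=True)
    return rows


def largest_workers(limit=5):
    """Run ps and return the `limit` biggest Apache processes by RSS (KiB)."""
    result = subprocess.run(
        ["ps", "-eo", "pid,rss,comm"],
        capture_output=True, text=True, check=True,
    )
    return parse_ps(result.stdout)[:limit]


if __name__ == "__main__":
    for pid, rss_kib in largest_workers():
        # Print a ready-to-paste strace command next to each candidate.
        print(f"{pid}\t{rss_kib / 1024:.1f} MiB\tstrace -f -p {pid}")
```

Run it a few times a minute apart. A child whose RSS keeps climbing between runs is the one worth attaching strace to.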
Solution 2:
Friday @ exactly 11pm? Does that correspond to a backup time? Does your system have the I/O available to serve processes and backups at that time? Does your trending software also trend # procs or even the Apache scoreboard? How about disk I/O?
The first thing I would do is calculate how much memory each process takes, then set a reasonable limit for MaxClients in Apache (MaxRequestWorkers in Apache 2.4) so that $procmem * $procs cannot exceed available RAM. I suspect your instance needs to be rebooted because the OOM killer kicks off a witch hunt that is often not very fruitful. You need to ensure your box can handle these heavy times by staying within its bounds, not going to swap and certainly not hitting OOM. This is harder if you have cronjobs going, and extremely difficult if said cronjobs run unilaterally without making sure it's safe to run (e.g. the every-5-minutes script fails to check whether the previous run is still going).
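The sizing arithmetic is simple enough to sketch in Python. The numbers below are made up for illustration (a 4 GiB box, 1 GiB held back for the OS, database and cronjobs, 60 MiB per Apache child); measure your own with ps or top. The function name is hypothetical too.

```python
def max_workers(total_ram_mib, reserved_mib, per_proc_mib):
    """Largest process count such that per_proc_mib * count fits in the
    RAM left over after setting aside reserved_mib for everything else."""
    if per_proc_mib <= 0:
        raise ValueError("per_proc_mib must be positive")
    usable_mib = total_ram_mib - reserved_mib
    if usable_mib <= 0:
        return 0
    return int(usable_mib // per_proc_mib)


if __name__ == "__main__":
    # Hypothetical figures: 4096 MiB total, 1024 MiB reserved, 60 MiB per child.
    print(max_workers(4096, 1024, 60))  # prints 51
```

Use the peak per-process size, not the average, and round down. The whole point is that the worst case still fits in RAM.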
Now that you've ensured that even if things go wildly wrong, you won't need to reboot your box, things will start going a lot better for you. You'll be able to log in during these heavy times and get a good idea of what's going on using top, dstat, free -m, iostat, etc.
Matt's method may be worth trying, but it should only be used as a troubleshooting tool. I do not recommend keeping it that way, because it will make the overall problem much harder to find the next time you're looking for it. That said, it will only really tease out issues with Apache/modules and not anything in your code. I think you'll agree the chances are good that it's not some sort of memory leak in an Apache module (assuming you're using a reputable distro).