Huge load on CentOS, many Apache processes
I'm experiencing a huge load on my server at the moment and I can't figure out why. When I use the 'top' command, there are hundreds of apache processes with the command "aux", but I can't find anything online that tells me what it means. The load is flapping between 50 and 150, which is a good 50-150 more than it usually is.
Netstat returns hundreds and hundreds of rows like this:
tcp 0 0 xxx.xxx.xxx.xxx:45216 61.155.202.205:80 CLOSE_WAIT 28863/aux
Almost all from 61.155.xxx.xxx (not sure if this is relevant information, but trying to give as much as possible).
The OS is CentOS release 5.7 (Final). We just run a LAMP stack on it with about 30 websites that don't get much load (or so I thought). I've checked the logs for all of the vHosts, but none seem to be getting many (or any) requests, certainly not enough to cause this trouble. I'm not sure if there are other logs I should be checking?
It started a couple of days ago; no changes made on the server as far as I'm aware.
Does anyone have any ideas for how I can track down what's causing the huge spike in load? Are there other commands/logs that I've missed that might be able to help me track down what the problem is?
That's not a connection from 61.155.xxx.xxx. That's a connection to a webserver on 61.155.202.205.
It looks very much like your webserver is making HTTP requests to other webservers on ADSL connections in China. Try running

    tcpdump -n -A -s0 host 61.155.202.205

to see what kind of data you are collecting. I suspect it's malicious.
If it is malicious, refer to "My server's been hacked! EMERGENCY".
The "many Apache processes" is most likely caused by the high load rather than causing the high load. Even at a load average of 50 I would expect to start seeing HTTP requests taking multiple seconds. At 150 it would be worse.
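To confirm where the outbound connections go and which process owns them, the netstat rows can be tallied rather than read by eye. The following is only a sketch: `count_close_wait` is a hypothetical helper name, and it assumes the column layout of the row quoted in the question (foreign address in column 5, state in column 6, PID/program in column 7), which is what `netstat -tnp` prints on Linux.

```shell
# Tally CLOSE_WAIT connections per remote IP and owning process.
# Reads netstat-style rows on stdin; assumes the column order
# proto, recv-q, send-q, local, foreign, state, pid/program.
count_close_wait() {
  awk '$6 == "CLOSE_WAIT" {
         split($5, remote, ":")        # drop the port from "ip:port"
         n[remote[1] " " $7]++
       }
       END { for (k in n) print n[k], k }' | sort -rn
}

# Usage on the affected server (run as root so the PID column is filled in):
#   netstat -tnp | count_close_wait
# A PID from the output can then be inspected, for example:
#   ls -l /proc/28863/exe /proc/28863/cwd
```

If one remote address and one program name dominate the output, that points at a single misbehaving script rather than ordinary web traffic. The `/proc` lookup may show what is really running behind a name like "aux" and from which directory it was started.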
To possibly help anyone who hits the same thing: it was a Trojan (Trojan.Perl.Shellbot-2) that was causing the problem. Between the answer/comments here and on my other question at 398715, we did the following:
- Installed and ran chkrootkit, but nothing was found
- Installed and ran clamav, which tracked down the name of the virus and where it was located
- Searched for others with the same problem and found this post
- Followed instructions on removing and cleaning up after the virus
- Added apache to the cron.deny file and restarted crond
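The scan and cron steps above might look roughly like the sketch below. It is not the exact procedure we followed: `deny_cron_user` is a hypothetical helper, the scan paths are assumptions, and checking the apache user's crontab is an extra step suggested only by the fact that blocking cron for that user was part of the fix.

```shell
# Append a user to a cron deny file unless the name is already listed.
deny_cron_user() {
  user=$1
  file=$2
  grep -qxF -- "$user" "$file" 2>/dev/null || printf '%s\n' "$user" >> "$file"
}

# Usage on the affected server, as root (paths are assumptions):
#   chkrootkit                        # found nothing in our case
#   clamscan -r -i /var/www /tmp      # recursive scan, print infected files only
#   crontab -u apache -l              # look for entries you did not create
#   deny_cron_user apache /etc/cron.deny
#   service crond restart
```

The helper only adds the name when it is missing, so re-running it does not leave duplicate entries in the deny file.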
This is only part of the solution; the server still needs to be rebuilt after we track down where the vulnerability is, but this was a good start, and we got the server back up and running.
If anyone can think of anything I've missed, something I'm doing wrong or could do better, please let me know.