Strace poll, trying to diagnose bottleneck

Solution 1:

Take a step back first -- you are going very low-level before finding out roughly where the problem is. A simple first check is whether static HTTP pages served by Apache are slow; if they are not, then maybe the DB is slow. The next step I would take is to measure how long the DB queries are taking.
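
As a rough sketch of that first check, something like the following could time a static page against a dynamic one. The function names are made up for illustration, and the URLs you pass in are whatever static and DB-backed pages your site actually has; nothing here is specific to Apache.

```python
import statistics
import sys
import time
import urllib.request


def fetch(url, timeout=10.0):
    """Download the whole response body so the timing covers the full request."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        resp.read()


def median_seconds(action, repeats=5):
    """Run action() `repeats` times and return the median wall-clock time in seconds."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        action()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


if __name__ == "__main__":
    # URLs come from the command line; the script prints nothing if none are given.
    for url in sys.argv[1:]:
        print(f"{median_seconds(lambda: fetch(url)):.4f}s  {url}")
```

Run it with one static URL and one DB-backed URL as arguments. If the static page is fast and the dynamic one is slow, that points away from Apache itself and towards the application or the DB.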

Also, if this is all on the same system, you can look at system resources with tools such as top, iotop, and iostat.


Regarding the poll system call:

If you do narrow it down to Apache, I would try adding the -c switch to the strace of Apache to see how much time is spent in each syscall before focusing on any one system call. For example, the top of my output from a healthy HAProxy is:

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 38.79    0.001152           0      6089           epoll_wait
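
If the summary is long, a small script can sort it for you. This is only a sketch that assumes the column layout shown above (with the errors column left blank when there were none); the SyscallRow and parse_strace_summary names are my own, not part of strace.

```python
from dataclasses import dataclass


@dataclass
class SyscallRow:
    percent_time: float
    seconds: float
    usecs_per_call: int
    calls: int
    errors: int
    syscall: str


def parse_strace_summary(text):
    """Parse `strace -c` summary text and return rows sorted by total seconds, highest first."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        # Data rows have 5 fields, or 6 when the errors column is filled in.
        if len(parts) not in (5, 6):
            continue
        try:
            percent, seconds = float(parts[0]), float(parts[1])
            usecs, calls = int(parts[2]), int(parts[3])
            errors = int(parts[4]) if len(parts) == 6 else 0
        except ValueError:
            continue  # header or separator line
        if parts[-1] == "total":
            continue  # skip the summary footer
        rows.append(SyscallRow(percent, seconds, usecs, calls, errors, parts[-1]))
    return sorted(rows, key=lambda row: row.seconds, reverse=True)


sample = """\
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 38.79    0.001152           0      6089           epoll_wait
"""

for row in parse_strace_summary(sample):
    print(row.syscall, row.calls, row.seconds)
```

What you are looking for is a syscall with a large seconds or usecs/call value, not just a large call count.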

The poll system call is used to wait for an event to become available on a file descriptor. Since the timeout is so low, my first guess is that this is the normal functioning of Apache. If this actually is a problem, it might be that Apache is running out of file descriptors, since a file descriptor is needed for each network socket. You could look at the Apache manual section on File Descriptors if you get to this point. But according to DerkK, "it'd be failing at open() or socket()", which makes a lot more sense.
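
To show why a poll that returns after a short timeout is not by itself a sign of trouble, here is a minimal sketch using Python's select.poll on a local socket pair (Linux/Unix only). It has nothing to do with Apache's internals, and wait_readable is a name made up for the example.

```python
import select
import socket


def wait_readable(sock, timeout_ms):
    """Return True if sock has data to read within timeout_ms, else False."""
    poller = select.poll()
    poller.register(sock, select.POLLIN)
    # poll() returns an empty list when the timeout expires with no events.
    return bool(poller.poll(timeout_ms))


a, b = socket.socketpair()
print(wait_readable(a, 10))  # False: nothing has been sent, poll just times out
b.sendall(b"x")
print(wait_readable(a, 10))  # True: data is waiting, poll returns immediately
a.close()
b.close()
```

A server sitting in a loop like the first call is simply idle and waiting for work, which is what a stream of short poll timeouts in strace usually looks like.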

So really this is just some detail on what very likely isn't your actual problem -- again, take several steps back.