Ubuntu Server 12.04 CPU Load
Solution 1:
Looks OK to me.
You've got 12 cores, across 2x 6-core CPUs. So at 100% utilisation, your load average would be around 12.
Load average is funny. I do not think it means what you think it means.
Load average is really an indication of how many processes are running or waiting to run (including those blocked on IO) at any one time, averaged over 1, 5 and 15 minute windows.
Looks to me like you're a little overcommitted, but not drastically.
Perhaps run http://mysqltuner.pl/mysqltuner.pl to get some idea of how your mysqld settings compare to actual usage.
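If you want to try it, something along these lines usually works (it only reads server status and variables, so it's safe on a live box):

```
wget http://mysqltuner.pl/mysqltuner.pl
perl mysqltuner.pl     # prompts for MySQL credentials, then prints tuning suggestions
```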
The next logical step, of course, is to separate MySQL and Apache onto different boxes. I'm not sure you're at that level yet, because you've still got a pantload of RAM free for MySQL to soak up. You might find some benefit from making the query cache and key buffer bigger, and it's probably worth a deeper look at MySQL's slow query log to see if you can optimise the queries and tables at all.
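For the slow query log, you can switch it on at runtime on MySQL 5.1+ without a restart; the threshold and log path below are just examples, adjust to taste (and make sure the directory is writable by the mysql user):

```
mysql -u root -p -e "SET GLOBAL slow_query_log = 'ON';
                     SET GLOBAL long_query_time = 2;
                     SET GLOBAL slow_query_log_file = '/var/log/mysql/mysql-slow.log';"
```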
There's loads of information out there about how to read load averages; really, it's more sensible to divide the load average by the number of cores, so you've got some idea of how utilised the server actually is.
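As a quick sanity check, something like this gives the 1-minute load per core (nproc and /proc/loadavg are standard on Ubuntu):

```
awk -v cores="$(nproc)" '{ printf "1-minute load per core: %.2f\n", $1 / cores }' /proc/loadavg
```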
I can see now that you've got 33% iowait. I suspect you've got a fairly write-heavy database, and that this is causing whole tables to be locked while you're writing, meaning concurrent writes (and reads) can't happen.
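One quick way to see whether table locking is actually biting (assuming MyISAM tables) is to watch the lock counters; a steadily climbing Table_locks_waited points at contention:

```
mysql -u root -p -e "SHOW GLOBAL STATUS LIKE 'Table_locks%'"
```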
Having had a sniff at my.cnf, max_connections looks quite high. That's not a huge concern on its own, but it does mean that if every connection is actually in use, MySQL could need around 27GB of RAM in the worst case. Which is loads, but again, not a huge concern.
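The rough worst-case maths is the global buffers plus the per-thread buffers multiplied by max_connections; you can pull the relevant variables out like this and do the sums yourself (the buffer names below are the usual suspects, adjust to whatever your my.cnf actually sets):

```
mysql -u root -p -e "SHOW VARIABLES WHERE Variable_name IN
  ('max_connections', 'key_buffer_size', 'innodb_buffer_pool_size',
   'sort_buffer_size', 'read_buffer_size', 'read_rnd_buffer_size',
   'join_buffer_size', 'tmp_table_size', 'thread_stack')"
```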
Consider turning on PHP APC Opcode caching.
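On Ubuntu 12.04 with Apache and mod_php that's normally just a package install (assuming the stock PHP 5.3 from the Ubuntu repos):

```
sudo apt-get install php-apc
sudo service apache2 restart
```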
**Edit**
Having seen the query log now, I'm inclined to think there are a few things that might benefit the server.
- PHP APC opcode caching (makes Apache more efficient generally)
- Convert all tables to InnoDB unless you've got a really good reason not to. If that reason is fulltext searching, find a better way to do it, and move to InnoDB. (See the sketch after this list.)
- Buy another server, and make it a dedicated DB host. Fit it with SAS disks, and separate it into partitions so that logging and data are on separate spindles (or rather, RAID arrays).
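As a sketch of the InnoDB conversion (take a backup first, and remember MyISAM fulltext indexes won't carry over on the MySQL version 12.04 ships; mydb.mytable below is just a placeholder):

```
# List the tables still on MyISAM
mysql -u root -p -e "SELECT table_schema, table_name FROM information_schema.tables
                     WHERE engine = 'MyISAM'
                       AND table_schema NOT IN ('mysql', 'information_schema', 'performance_schema')"

# Then convert them one at a time
mysql -u root -p -e "ALTER TABLE mydb.mytable ENGINE=InnoDB"
```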
Without a much deeper look into what the hell is going on, it's difficult to actually say.
Might be worth a trial run with NewRelic for PHP. It's free for a month, and does tend to give good insight into bad code smells.
Alternatively, I am available for consultancy ;)
Solution 2:
There is one striking point in your top output, and that is the number of processes in D state. A good chunk of the apache2 processes, and even kjournald, are in D state. D state processes are known to increase the load average.
Usually, a process goes into D state when it is waiting on IO; once the IO completes, it moves back from D to R or S state. The next thing you can do for live debugging is to check how long these D state processes have been running. If it's for quite some time, that's a problem.
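A quick way to do that on the live box is to list everything currently in D state, along with its age and the kernel function it's waiting in; if the same PIDs keep showing up across repeated runs, they're stuck:

```
ps -eo pid,stat,etime,wchan:30,comm | awk 'NR == 1 || $2 ~ /^D/'
```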
Anyway, your problem, if it is the high load average, lies in iowait: 33.1% is the iowait value reported by top. %usr and %sys are not high, so we can safely rule out processes going haywire, the CPU underperforming, or a memory bottleneck. The problem is iowait, apparently. I mostly work with RHEL, so I am not 100% sure which built-in tools Ubuntu ships with.
What I mostly do is collect several iterations of top, vmstat for some time, iostat for some time (with the switches that show the per-device breakdown), plus one iteration of ps and ps -xv, and then go through them. Often the first level of debugging can be done from just that. Next, I might collect some oprofile or perf output depending on the RHEL version, but that is another story.
Irrespective, please run all the debugging commands over the same time window to get a more fine-grained, correlated view. For example:
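A rough way to capture all of that over the same window (intervals and counts are just examples; iostat -x is the switch that gives the per-device breakdown with await and %util, and ps axv is the Linux spelling of the ps -xv mentioned above):

```
top -b -n 30 -d 10 > top.out &
vmstat 5 60        > vmstat.out &
iostat -dxk 5 60   > iostat.out &
ps aux             > ps.out
ps axv             > psv.out
wait
```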