High Server Load cannot figure out why [closed]

Solution 1:

It's pretty clear that your disk has reached its limit. In general, %wa (iowait) should be very low (under 1% for a typical website), and %util (from iostat -x) should be as low as possible; an idle disk shows 0.
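If you want to check the iowait figure without top, here is a minimal Python sketch that derives it from two samples of the aggregate `cpu` line in /proc/stat. The function names (`parse_cpu_line`, `iowait_percent`, `sample_iowait`) are made up for illustration, and it assumes a Linux host where the fifth counter on that line is iowait.

```python
import time

def parse_cpu_line(line):
    """Return the counters from the aggregate 'cpu' line of /proc/stat."""
    fields = line.split()
    if fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    # Order: user nice system idle iowait irq softirq steal.
    # The guest fields that may follow are already counted in user/nice.
    return [int(value) for value in fields[1:9]]

def iowait_percent(before, after):
    """Percentage of CPU time spent in iowait between two samples."""
    deltas = [b - a for a, b in zip(before, after)]
    total = sum(deltas)
    if total <= 0:
        return 0.0
    return 100.0 * deltas[4] / total  # index 4 is the iowait counter

def sample_iowait(interval=1.0, path="/proc/stat"):
    """Read /proc/stat twice, `interval` seconds apart (Linux only)."""
    def read():
        with open(path) as handle:
            return parse_cpu_line(handle.readline())
    first = read()
    time.sleep(interval)
    return iowait_percent(first, read())
```

Calling `sample_iowait()` on the affected server should give roughly the same number as the %wa column in top over the same interval.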

You can use iotop to find out what process is causing all the disk usage.
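If iotop is not installed, a rough approximation is to compare the per-process counters in /proc/<pid>/io between two points in time. The sketch below is only an illustration of that idea, not how iotop itself works; the helper names are hypothetical, and reading other users' processes normally requires root.

```python
import os

def parse_proc_io(text):
    """Parse the contents of /proc/<pid>/io into a dict of integer counters."""
    counters = {}
    for line in text.splitlines():
        key, _, value = line.partition(":")
        if value.strip():
            counters[key.strip()] = int(value)
    return counters

def snapshot(proc="/proc"):
    """Map pid -> I/O counters for every process we are allowed to read."""
    result = {}
    for entry in os.listdir(proc):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc, entry, "io")) as handle:
                result[int(entry)] = parse_proc_io(handle.read())
        except OSError:
            continue  # process exited, or permission denied
    return result

def rank_by_disk_io(before, after, top=5):
    """Return [(pid, bytes_moved)] for the heaviest disk users between snapshots."""
    usage = []
    for pid, now in after.items():
        then = before.get(pid)
        if then is None:
            continue  # process started between the two snapshots
        moved = (now["read_bytes"] - then["read_bytes"]) + (
            now["write_bytes"] - then["write_bytes"]
        )
        usage.append((pid, moved))
    usage.sort(key=lambda item: item[1], reverse=True)
    return usage[:top]
```

Take one `snapshot()`, wait a few seconds, take another, and pass both to `rank_by_disk_io()`; the pids at the top of the list are the ones to look at first.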

If it turns out to be mysql, you should enable the slow query log in my.cnf (and restart mysql). Then you'll be able to find out which specific queries are causing it.

Or, I think your sdb is broken. Try getting the hardware checked out.

Edit: iotop (available through EPEL) is an awesome tool that shows you which processes are causing the iowait.

Solution 2:

Your sdb is behaving unusually: either the drive has gone bad or its workload has changed. If the traffic pattern on your websites is the same and this is a new problem, that is enough evidence that you need to replace sdb.

There are two queues in the path of any I/O in Linux. One is the I/O scheduler queue, whose depth is controlled by nr_requests; the other is the queue inside the hardware. Merging of I/O requests happens in the scheduler layer. So when avgqu-sz (the average queue size) is small while await is large and svctm is low, it means the storage is taking a long time to service those I/O requests.

In other words, slow storage or, more likely, bad storage.

%util shows how many milliseconds out of every 1000 the device spent busy servicing I/O, expressed as a percentage. The higher it is, the harder your disk is being hammered. A high value doesn't necessarily mean your disk is under heavy load, but in your case it means the disk is slow, very slow.
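To make the iostat columns less of a black box, here is a hedged Python sketch of how %util, await and avgqu-sz can be derived from two samples of a device's line in /proc/diskstats. The function names and the sample numbers in the usage note are invented for illustration; the field positions follow the kernel's documented diskstats layout.

```python
def parse_diskstats_line(line):
    """Pick the counters we need from one line of /proc/diskstats."""
    f = line.split()
    return {
        "name": f[2],
        "reads": int(f[3]),         # reads completed
        "read_ms": int(f[6]),       # total ms spent on reads
        "writes": int(f[7]),        # writes completed
        "write_ms": int(f[10]),     # total ms spent on writes
        "busy_ms": int(f[12]),      # ms the device had I/O in flight
        "weighted_ms": int(f[13]),  # busy time weighted by queue depth
    }

def io_metrics(before, after, interval_ms):
    """Compute iostat-style metrics from two samples taken interval_ms apart."""
    ios = (after["reads"] - before["reads"]) + (after["writes"] - before["writes"])
    wait_ms = (after["read_ms"] - before["read_ms"]) + (
        after["write_ms"] - before["write_ms"]
    )
    busy_ms = after["busy_ms"] - before["busy_ms"]
    queued_ms = after["weighted_ms"] - before["weighted_ms"]
    return {
        "util_pct": 100.0 * busy_ms / interval_ms,   # busy ms per 1000 ms, as %
        "await_ms": wait_ms / ios if ios else 0.0,   # avg time per request
        "avgqu_sz": queued_ms / interval_ms,         # avg requests in flight
    }
```

For example, if over one second a hypothetical sdb completes 100 requests that together waited 10000 ms while the device was busy for 950 ms, this yields an await of 100 ms at 95% util: few requests, each taking a long time, which is the slow-storage pattern described above.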