High load average, low CPU usage

My server has slowed down, and I don't know why.

Output from top:

top - 14:32:50 up 639 days,  6:30,  1 user,  load average: 67.93, 70.63, 79.85
Tasks: 245 total,   1 running, 244 sleeping,   0 stopped,   0 zombie
Cpu(s):  3.9% us,  0.5% sy,  0.0% ni, 94.5% id,  1.0% wa,  0.0% hi,  0.0% si
Mem:   1034784k total,  1021256k used,    13528k free,     4360k buffers
Swap:  1023960k total,   635752k used,   388208k free,    36632k cached

vmstat 10 6

procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in    cs us sy id wa
 0 110 795604  12328   3980  46676    0    0     0     0    0     0  4  1 95  1
 0 97 788848  12052   3960  46256 2985   33  3323    33  429     0  2  1  0 97
 0 119 782660  13992   4096  45740 2780   14  2995   360  435     0  2  1  1 96
 0 121 775924  15600   3724  42796 3084    0  3443   136  440     0  2  1  0 98
 0 113 769392  13576   3476  41968 3002    0  3458     7  426     0  2  1  0 97
 0 113 762284  12440   3332  34884 3151    0  3553    61  427     0  2  1  0 97

doitprod2:/var/log# grep -c processor /proc/cpuinfo

2

iostat 2

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda             166.00      7128.00        52.00      14256        104

OK, after killing and restarting some processes, it is fine now. Thanks for your help anyway.


You probably have processes in the uninterruptible sleep state (shown as D). Processes are normally in that state because they are waiting on hardware, for example a read from disk. Those processes are effectively sleeping (top counts them among your 244 sleeping processes), but they are still included in the load average calculation. Check your server's I/O with vmstat, and look for many processes with status D in top or ps to confirm.
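If it helps, here is a minimal Python sketch for listing the D-state processes. It assumes a Linux box with procps, where `ps -eo stat,pid,comm` prints a header row followed by one line per process. The function names are my own, not part of any tool.

```python
import subprocess


def d_state_processes(ps_output):
    """Return (pid, command) for every process whose STAT starts with 'D'."""
    blocked = []
    for line in ps_output.splitlines()[1:]:  # skip the "STAT PID COMMAND" header
        fields = line.split(None, 2)
        if len(fields) < 3:
            continue  # ignore blank or malformed lines
        stat, pid, comm = fields
        if stat.startswith("D"):  # D = uninterruptible sleep
            blocked.append((int(pid), comm.strip()))
    return blocked


def list_blocked():
    """Run ps on the local machine and return its D-state processes."""
    out = subprocess.run(
        ["ps", "-eo", "stat,pid,comm"],
        capture_output=True, text=True, check=True,
    ).stdout
    return d_state_processes(out)
```

Calling `list_blocked()` a few times in a row should show whether the same commands keep turning up in D state, which usually points at whatever is stuck on the disk.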

EDIT: Your vmstat output further confirms the I/O problem. The b column under procs shows around 110 processes in uninterruptible sleep on average. The bi column (blocks read from a block device) is very high, as is the si column (memory swapped in from disk). Finally, under the cpu header, the wa column shows that your CPU spends more than 90% of its time waiting for I/O to complete.
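To put numbers on that, here is a rough Python sketch that averages the relevant vmstat columns. The `SAMPLE` text is the output pasted in the question; `vmstat_averages` is a hypothetical helper name. It skips the first sample by default, because vmstat's first report is an average since boot rather than a live reading.

```python
SAMPLE = """
procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in    cs us sy id wa
 0 110 795604  12328   3980  46676    0    0     0     0    0     0  4  1 95  1
 0 97 788848  12052   3960  46256 2985   33  3323    33  429     0  2  1  0 97
 0 119 782660  13992   4096  45740 2780   14  2995   360  435     0  2  1  1 96
 0 121 775924  15600   3724  42796 3084    0  3443   136  440     0  2  1  0 98
 0 113 769392  13576   3476  41968 3002    0  3458     7  426     0  2  1  0 97
 0 113 762284  12440   3332  34884 3151    0  3553    61  427     0  2  1  0 97
"""


def vmstat_averages(text, columns=("b", "si", "bi", "wa"), skip_first=True):
    """Average the chosen vmstat columns over the sampled rows."""
    lines = [ln.split() for ln in text.strip().splitlines()]
    # The column-name row is the one that starts with "r b".
    header_idx = next(i for i, f in enumerate(lines) if f[:2] == ["r", "b"])
    header = lines[header_idx]
    rows = [
        [int(x) for x in f]
        for f in lines[header_idx + 1:]
        if len(f) == len(header) and f[0].isdigit()  # data rows only
    ]
    if skip_first and len(rows) > 1:
        rows = rows[1:]  # first report is the since-boot average
    return {c: sum(r[header.index(c)] for r in rows) / len(rows) for c in columns}


if __name__ == "__main__":
    print(vmstat_averages(SAMPLE))
```

On the five live samples from the question this gives an average of 112.6 blocked processes and 97.0% iowait.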

You need to find out why you are having these I/O problems. It could be insufficient server capacity, runaway processes, or something else, but it is definitely I/O.


Watch the wa value in top (1.0% in your output) to see whether it climbs. Since the machine is already using swap, your processes may well be waiting for I/O.

Check cat /proc/sys/fs/file-nr to see whether the first number is close to the third one (allocated file handles versus the system-wide maximum).
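A small Python sketch of that comparison, assuming the usual three-field layout of file-nr (allocated handles, allocated-but-unused handles, maximum). The helper names are made up for illustration.

```python
def file_handle_usage(file_nr_text):
    """Return the fraction of the system-wide file-handle limit in use."""
    allocated, unused, maximum = (int(x) for x in file_nr_text.split()[:3])
    # On 2.6 and later kernels the second field is always 0, so this is
    # simply the first number divided by the third.
    return (allocated - unused) / maximum


def read_file_handle_usage(path="/proc/sys/fs/file-nr"):
    """Read the live value on a Linux host."""
    with open(path) as fh:
        return file_handle_usage(fh.read())
```

A result close to 1.0 would mean the box is about to run out of file handles. A small value rules this out as the cause.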

Are you on a VPS?


The iowait (the last column, "wa") in the vmstat output is very high, and there is a lot of paging going on (pages of data being moved between physical memory and disk-based swap).

This machine would benefit from more physical RAM.