Why is Ubuntu slow under heavy network and disk I/O?

You could experiment with I/O schedulers. The default I/O scheduler is CFQ, which works pretty well for desktops, but it's been my experience that Deadline tends to work better for file servers. You can change the I/O scheduler on the fly, so it is easy to experiment and see what works best in your situation.

To list the available I/O schedulers for a device, use this command:

cat /sys/block/sdb/queue/scheduler

This should return something like noop anticipatory deadline [cfq]. The scheduler in square brackets is the one currently active.

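To change your scheduler to deadline, use the following command on the appropriate device. The write goes through sudo tee because with sudo echo ... > file the redirection is performed by your own unprivileged shell and fails with "Permission denied".

echo deadline | sudo tee /sys/block/sdb/queue/scheduler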
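If you have more than one disk, it helps to see which scheduler each one is using before and after the change. Here is a minimal sketch, assuming the same sysfs layout as above. The helper name active_scheduler is made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: print the scheduler the kernel marks as active,
# i.e. the name wrapped in [brackets] on the line read from stdin.
active_scheduler() {
    sed -n 's/.*\[\([^]]*\)\].*/\1/p'
}

# Report the active scheduler for each block device that exposes one.
for f in /sys/block/*/queue/scheduler; do
    [ -r "$f" ] || continue          # skip if the glob matched nothing
    dev=${f#/sys/block/}             # strip the leading path
    dev=${dev%%/*}                   # keep only the device name
    printf '%s: %s\n' "$dev" "$(active_scheduler < "$f")"
done
```

A device whose scheduler file contains no bracketed entry is printed with an empty value.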

Try running iotop. It should show you which processes are generating the disk I/O.


Do you see that many interrupts (System - in) and context switches (System - cs) during normal operation? I ask because you describe even the mouse cursor becoming slow. If some problem is causing your system to be overwhelmed by interrupts under load, that would slow everything down.
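To compare idle and loaded behaviour, you can log just those two figures. The sketch below assumes the numbers come from vmstat, whose "system" group has columns named in and cs. The helper name in_cs_columns is made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: read vmstat output on stdin and print only the
# "in" (interrupts/s) and "cs" (context switches/s) columns. The columns
# are located by header name rather than by fixed position.
in_cs_columns() {
    awk '
        # Header row: remember which fields are named "in" and "cs".
        $1 == "r" && $2 == "b" {
            for (i = 1; i <= NF; i++) {
                if ($i == "in") icol = i
                if ($i == "cs") ccol = i
            }
            next
        }
        # Data rows start with a number; print the two remembered fields.
        icol && ccol && $1 ~ /^[0-9]+$/ { print "in=" $icol, "cs=" $ccol }
    '
}

# Take three samples one second apart, if vmstat is installed.
if command -v vmstat >/dev/null 2>&1; then
    vmstat 1 3 | in_cs_columns
fi
```

Run it once while the machine is idle and once during a heavy transfer. The first vmstat sample is an average since boot, so judge by the later ones.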

And just to take a total shot in the dark, is there anything in /var/log/dmesg about errors or timeouts from your disks or RAID devices?
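A quick way to scan that log is sketched below. The pattern list is my own guess at what disk, controller, and RAID trouble tends to look like, not a complete catalogue of kernel messages, and the helper name disk_errors is made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: keep only kernel log lines that look like disk,
# controller, or RAID trouble. The patterns are assumptions; adjust them
# to the hardware and drivers in use.
disk_errors() {
    grep -iE 'i/o error|time(d)? ?out|medium error|ata[0-9]+.*(error|exception|reset)|md[0-9]+.*(fail|degraded)'
}

log=/var/log/dmesg
if [ -r "$log" ]; then
    # grep exits non-zero when nothing matches, so report that case.
    disk_errors < "$log" || echo "no matching lines in $log"
fi
```

The same filter works on the live kernel buffer with dmesg | disk_errors.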

Edit 1:

I ran across an article this morning that really sounded like the issue that you are seeing on your box. Greg Smith walks through an analysis of a server that seems to freeze disk writes for extended periods of time. His particular investigative method involves running the command:

while [ 1 ]; do cat /proc/meminfo; sleep 1; done

and looking at the "Writeback:" cache size before and during a period where the system seems to hang. If the writeback cache is indeed filling up (roughly >40% full) and causing the system to suspend writes while it flushes, then Greg suggests some OS tuning that might mitigate the problem. Greg's blog entry can be found at http://notemagnet.blogspot.com/2008/08/linux-write-cache-mystery.html
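Dumping all of /proc/meminfo every second is noisy, so here is a narrower sketch of the same idea that prints only the Dirty and Writeback lines. It also prints the kernel's vm.dirty_ratio and vm.dirty_background_ratio settings, which I am assuming are the thresholds relevant here. Check the article before changing them. The helper name dirty_writeback and the sample count are made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: print only the Dirty and Writeback lines from
# meminfo-formatted text on stdin.
dirty_writeback() {
    awk '$1 == "Dirty:" || $1 == "Writeback:" { print $1, $2, $3 }'
}

# Show the current dirty-page thresholds (percent of memory), if present.
for knob in dirty_background_ratio dirty_ratio; do
    if [ -r "/proc/sys/vm/$knob" ]; then
        printf 'vm.%s = %s\n' "$knob" "$(cat "/proc/sys/vm/$knob")"
    fi
done

# Take three samples one second apart; raise the count to watch longer.
i=0
while [ "$i" -lt 3 ] && [ -r /proc/meminfo ]; do
    date '+%H:%M:%S'
    dirty_writeback < /proc/meminfo
    i=$((i + 1))
    sleep 1
done
```

Start it before kicking off the heavy transfer and watch whether Dirty and Writeback climb just before the stall.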