Troubleshooting poor disk IO performance

Note that while this question is somewhat Redis-specific, the underlying problem is generic: one process consumes so much HDD write bandwidth that other processes can't write anything.

We've got an Ubuntu VM running on an Ubuntu-based Xen XCP host (installed on two HDDs in software RAID1). The VM runs a Redis server under a load of about 2K commands/s.

Problem: when said Redis server does BGREWRITEAOF, it blocks its clients for about 10 seconds.
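The stall is easy to observe from the client side; a minimal sketch (assuming redis-cli can reach the instance on the default port):

redis-cli --latency

Leave it running, trigger the rewrite with redis-cli BGREWRITEAOF in another terminal, and the reported max latency jumps to roughly the duration of the rewrite.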

Details:

Only AOF persistence is used, no RDB. Redis is configured to fsync the AOF file once per second.
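For reference, the relevant redis.conf settings for this setup look roughly like this (a sketch; the surrounding file contents depend on your Redis version):

appendonly yes

appendfsync everysec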

On BGREWRITEAOF, Redis forks and does all the disk-intensive work in the child process. Meanwhile, the main process keeps appending data to its AOF file.

BGREWRITEAOF takes about 10 seconds (1.5 GB of data at 150 MB/s disk write speed). The child process doing the rewrite consumes all of the HDD's write throughput.
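The saturation can be confirmed by watching per-device stats while the rewrite runs (a sketch assuming the sysstat package is installed; look at the RAID member devices):

iostat -xm 1

During the rewrite, %util on the members sits near 100 and the write throughput column stays at the disk's sequential limit, which matches the interpretation above (column names vary slightly between sysstat versions).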

The parent process attempts to fsync; when that takes more than two seconds, Redis's data-protection logic kicks in and a blocking write() is issued, stalling the parent process until BGREWRITEAOF is finished with the disk.

Here is the detailed info and discussion that led me to the above interpretation of events.

Question: It looks fishy to me that one process is allowed to take so much disk IO that everything else is blocked. Is there something I can do at the system level to fix that? I'm OK with BGREWRITEAOF taking a little more time, as long as the parent process is allowed to save its data while the rewrite is active.

Please note that I'm aware of workarounds, like moving AOF persistence to a slave, using the no-appendfsync-on-rewrite Redis config option, etc.; this question is specifically about resolving the problem, not working around it.


Solution 1:

AFAICS you can try changing the IO scheduler. Try this command:

echo cfq > /sys/block/$DEVICE/queue/scheduler

Where $DEVICE is your RAID1 disk. This command sets the 'Completely Fair Queuing' (CFQ) scheduler for your device.
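You can check which scheduler is currently active (the one shown in square brackets) before and after the change; a minimal sketch, assuming the device is sda:

cat /sys/block/sda/queue/scheduler

echo cfq > /sys/block/sda/queue/scheduler

The setting does not survive a reboot on its own; to keep it, add the echo line to a boot script such as /etc/rc.local, or boot the kernel with the elevator=cfq parameter.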