High CPU usage of flush process
I know that flush process is garbage collector of kernel, but in my case on two servers that process is really CPU intesive. In most of time it uses 80-100% of CPU.
2898 root 20 0 0 0 0 R 78 0.0 2900:22 flush-0:21
What can cause that. I thought about corrupted memory, but on two servers in one time? I think it started to happen after kernel upgrade. Maybe there is some known bug?
Edit:
Updated information. It's Gentoo Linux 64-bit, kernel version is 2.6.39-gentoo-r2. It has 8 GB of RAM. There is no much IO activity.
Device: tps kB_read/s kB_wrtn/s kB_read kB_wrtn
sda 5.01 87.19 5.55 166452685 10596484
sdb 5.01 87.30 5.55 166662767 10596484
md0 10.05 160.74 2.75 306883505 5258392
md1 3.61 13.74 2.10 26229593 4006684
Weird thing is that IO activity on sda/sdb, these are swap partitions, that is turned off.
We are using only uwsgi procesess and some Python scripts running from crontab.
iostat -x 5
Linux 2.6.39-gentoo-r2 (python-1) 07/27/11 _x86_64_ (8 CPU)
avg-cpu: %user %nice %system %iowait %steal %idle
44.98 0.00 3.73 0.81 0.00 50.48
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
sda 1.37 0.70 4.35 0.67 87.16 5.55 37.00 0.08 15.16 15.61 12.21 3.07 1.54
sdb 1.37 0.70 4.35 0.67 87.27 5.55 36.99 0.07 14.84 15.22 12.35 3.11 1.56
md0 0.00 0.00 9.36 0.69 160.67 2.76 32.51 0.00 0.00 0.00 0.00 0.00 0.00
md1 0.00 0.00 3.11 0.50 13.76 2.09 8.79 0.00 0.00 0.00 0.00 0.00 0.00
avg-cpu: %user %nice %system %iowait %steal %idle
68.24 0.00 25.01 0.30 0.00 6.45
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
sda 0.00 0.20 1.20 0.80 4.80 4.00 8.80 0.01 7.10 7.50 6.50 7.10 1.42
sdb 0.00 0.20 1.00 0.80 4.80 4.00 9.78 0.01 7.00 6.00 8.25 7.00 1.26
md0 0.00 0.00 0.00 0.60 0.00 2.40 8.00 0.00 0.00 0.00 0.00 0.00 0.00
md1 0.00 0.00 2.00 0.00 8.80 0.00 8.80 0.00 0.00 0.00 0.00 0.00 0.00
avg-cpu: %user %nice %system %iowait %steal %idle
68.24 0.00 21.13 1.18 0.00 9.45
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
sda 0.00 0.00 1.20 0.00 6.40 0.00 10.67 0.01 6.50 6.50 0.00 6.50 0.78
sdb 0.00 0.00 1.40 0.00 7.20 0.00 10.29 0.02 11.43 11.43 0.00 11.43 1.60
md0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
md1 0.00 0.00 2.60 0.00 13.60 0.00 10.46 0.00 0.00 0.00 0.00 0.00 0.00
avg-cpu: %user %nice %system %iowait %steal %idle
60.73 0.00 22.34 2.75 0.00 14.18
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
sda 0.00 0.00 5.40 0.00 22.40 0.00 8.30 0.08 14.22 14.22 0.00 6.04 3.26
sdb 0.20 0.00 3.80 0.00 36.80 0.00 19.37 0.03 7.74 7.74 0.00 7.74 2.94
md0 0.00 0.00 7.00 0.00 48.80 0.00 13.94 0.00 0.00 0.00 0.00 0.00 0.00
md1 0.00 0.00 2.40 0.00 10.40 0.00 8.67 0.00 0.00 0.00 0.00 0.00 0.00
avg-cpu: %user %nice %system %iowait %steal %idle
74.22 0.00 20.08 1.25 0.00 4.45
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
sda 0.20 2.20 10.80 0.60 92.00 11.20 18.11 0.07 5.81 5.78 6.33 5.81 6.62
sdb 0.60 2.20 11.60 0.60 144.80 11.20 25.57 0.08 6.92 6.83 8.67 6.25 7.62
md0 0.00 0.00 22.00 2.40 226.40 9.60 19.34 0.00 0.00 0.00 0.00 0.00 0.00
md1 0.00 0.00 1.20 0.00 10.40 0.00 17.33 0.00 0.00 0.00 0.00 0.00 0.00
avg-cpu: %user %nice %system %iowait %steal %idle
69.17 0.00 21.25 0.85 0.00 8.72
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
sda 0.00 0.00 0.60 0.00 2.40 0.00 8.00 0.00 6.00 6.00 0.00 6.00 0.36
sdb 0.00 0.00 0.80 0.00 7.20 0.00 18.00 0.01 9.75 9.75 0.00 9.75 0.78
md0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
md1 0.00 0.00 1.40 0.00 9.60 0.00 13.71 0.00 0.00 0.00 0.00 0.00 0.00
avg-cpu: %user %nice %system %iowait %steal %idle
56.99 0.00 22.66 3.63 0.00 16.73
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
sda 0.00 0.00 1.60 1.20 8.00 4.80 9.14 0.02 8.00 10.62 4.50 7.21 2.02
sdb 0.00 0.00 1.40 1.20 7.20 4.80 9.23 0.02 8.38 10.71 5.67 8.15 2.12
md0 0.00 0.00 0.40 0.80 1.60 3.20 8.00 0.00 0.00 0.00 0.00 0.00 0.00
md1 0.00 0.00 2.60 0.00 13.60 0.00 10.46 0.00 0.00 0.00 0.00 0.00 0.00
avg-cpu: %user %nice %system %iowait %steal %idle
68.65 0.00 25.95 1.55 0.00 3.85
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
sda 0.40 45.80 43.20 19.40 445.60 260.80 22.57 0.48 7.71 9.05 4.73 4.67 29.26
sdb 1.00 45.80 48.00 19.40 607.20 260.80 25.76 0.56 8.26 9.70 4.70 4.06 27.36
md0 0.00 0.00 102.40 64.40 1020.00 257.60 15.32 0.00 0.00 0.00 0.00 0.00 0.00
md1 0.00 0.00 6.80 0.00 33.60 0.00 9.88 0.00 0.00 0.00 0.00 0.00 0.00
avg-cpu: %user %nice %system %iowait %steal %idle
67.86 0.00 22.76 2.03 0.00 7.35
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
sda 8.80 1.00 74.20 0.80 590.40 7.20 15.94 0.26 3.46 3.44 4.50 3.07 23.06
sdb 2.20 1.00 77.80 0.80 552.00 7.20 14.23 0.31 3.94 3.92 6.00 3.30 25.96
md0 0.00 0.00 115.20 1.40 907.20 5.60 15.66 0.00 0.00 0.00 0.00 0.00 0.00
md1 0.00 0.00 48.00 0.00 234.40 0.00 9.77 0.00 0.00 0.00 0.00 0.00 0.00
flush
is not "garbage collection in the kernel". I'm not sure where you read that but C has no garbage collector. There is no automatic memory management. C programmers have to manage their own memory allocations. Anyway...
flush
is the process which the virtual memory subsystem uses to write dirty pages out to disk. This is called pdflush
or bdflush
in the EL kernels.
When you perform any (non-direct) IO, the file you access goes into the pagecache. If you read a text file from disk, that textfile now exists on the disk and in cache memory. If you make some changes and save that file, the actual write()
system call that your text editor makes completes against the copy of the file in memory. This way, the system call returns quickly and your text editor can get back to accepting your input, rather than pausing while it writes data out to the (relatively) slow hard disk. The memory pages which that modified text file occupies are now called "dirty pages". The kernel worries about putting the actual block writes out to disk at a later time, this is called "syncing" or "flushing dirty pages".
You'd expect to see a flush process get high CPU usage at times, because the process will be performing IO to disk and will probably be spending most of its time in iowait.
There is nothing to worry about. Your system is behaving like every other Linux system.