kworker consuming +90% IO and zero disk write
This is a standard Apache web server on the AWS Linux AMI with EBS. We are seeing a high load average (8+), and iotop -a shows:
Total DISK READ: 0.00 B/s | Total DISK WRITE: 2.37 M/s
TID PRIO USER DISK READ DISK WRITE SWAPIN IO> COMMAND
3730 be/4 root 0.00 B 0.00 B 0.00 % 91.98 % [kworker/u8:1]
774 be/3 root 0.00 B 1636.00 K 0.00 % 15.77 % [jbd2/xvda1-8]
3215 be/4 apache 0.00 B 40.39 M 0.00 % 0.88 % httpd
3270 be/4 apache 0.00 B 38.20 M 0.00 % 0.93 % httpd
2770 be/4 apache 0.00 B 46.86 M 0.00 % 0.71 % httpd
When Apache is down, kworker and jbd2 activity goes down as well.
The server is not swapping, as we have plenty of RAM available. I've seen this issue reported for database servers, but nothing isolated to Apache alone.
Any ideas on how to diagnose this further and prevent it?
UPDATE 1: perf report (perf record -g -a sleep 10)
Samples: 114K of event 'cpu-clock', Event count (approx.): 28728500000
- 83.58% swapper [kernel.kallsyms] [k] xen_hypercall_sched_op
+ xen_hypercall_sched_op
+ default_idle
+ arch_cpu_idle
- cpu_startup_entry
70.16% cpu_bringup_and_idle
- 29.84% rest_init
start_kernel
x86_64_start_reservations
xen_start_kernel
+ 1.73% httpd [kernel.kallsyms] [k] __d_lookup_rcu
+ 1.08% httpd [kernel.kallsyms] [k] xen_hypercall_xen_version
+ 0.38% httpd [vdso] [.] 0x0000000000000d7c
+ 0.36% httpd libphp5.so [.] zend_hash_find
+ 0.33% httpd libphp5.so [.] _zend_hash_add_or_update
+ 0.25% httpd libc-2.17.so [.] __memcpy_ssse3
+ 0.24% httpd libphp5.so [.] _zval_ptr_dtor
+ 0.24% httpd [kernel.kallsyms] [k] __audit_syscall_entry
+ 0.22% httpd [kernel.kallsyms] [k] pvclock_clocksource_read
An IO value near 100% doesn't mean the thread is using all of your I/O capacity. It means the thread spent nearly all of the sampled time doing nothing but waiting on I/O. A high IO percentage with low or zero disk bandwidth can therefore be normal.
From man iotop:
[...] It also displays the percentage of time the thread/process spent while swapping in and while waiting on I/O.
It may be a different issue if your kworker is waiting on I/O forever, but I don't know. Maybe it's supposed to be waiting on a pipe or something. I see kworker doing the same on my server sometimes, and it doesn't seem to be a problem. (I also panicked the first time I saw it.)
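To make the "percentage of time waiting on I/O" idea concrete, here is a minimal Python sketch. It is not how iotop itself collects the figure (as far as I know, iotop reads it through the kernel's taskstats interface). Instead it assumes the related delayacct_blkio_ticks counter, field 42 of /proc/<pid>/stat as described in proc(5), which may stay at zero if delay accounting is not enabled in the kernel. The function names are made up for illustration.

```python
import os
import time


def blkio_ticks(stat_line: str) -> int:
    """Return delayacct_blkio_ticks (field 42 of /proc/<pid>/stat)."""
    # The command name (field 2) is wrapped in parentheses and may itself
    # contain spaces or ')', so split only what follows the last ')'.
    rest = stat_line[stat_line.rindex(")") + 1:].split()
    # rest[0] is field 3 (state), so field 42 sits at index 42 - 3 = 39.
    return int(rest[39])


def io_wait_percent(ticks_before: int, ticks_after: int,
                    interval_s: float, ticks_per_s: int) -> float:
    """Share of the interval the task spent blocked on block I/O."""
    waited_s = (ticks_after - ticks_before) / ticks_per_s
    return 100.0 * waited_s / interval_s


def sample(pid: int, interval_s: float = 1.0) -> float:
    """Read the counter twice and report the I/O wait percentage."""
    path = f"/proc/{pid}/stat"
    ticks_per_s = os.sysconf("SC_CLK_TCK")  # clock ticks per second
    with open(path) as f:
        before = blkio_ticks(f.read())
    time.sleep(interval_s)
    with open(path) as f:
        after = blkio_ticks(f.read())
    return io_wait_percent(before, after, interval_s, ticks_per_s)
```

Calling sample(3730) on the server from the question would sample the kworker thread from the iotop listing. The point of the sketch is that the percentage depends only on how long the task was blocked, not on how many bytes moved, which is why a thread can show a high IO figure next to 0.00 B of reads and writes.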