kjournald reasons for high usage

I'm trying to figure out why kjournald is going crazy on my machine. It's an 8-core box with loads of memory. It's got ~50% CPU load.

iotop doesn't seem to point at any specific process: just some bursts of writes here and there (mostly cron starting, some monitoring stats being generated, etc.). When I used /proc/sys/vm/block_dump to gather write statistics, I got lists like this:

kjournald(1352): 1909
sendmail(28934): 13
cron(28910): 12
cron(28912): 11
munin-node(29015): 3
cron(28913): 3
check_asterisk_(28917): 3
sh(28917): 2
munin-node(29022): 2
munin-node(29021): 2

The kjournald entries are all WRITEs.
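
For reference, a tally like the one above can be produced roughly as follows. This is only a sketch: the function name tally_block_dump is made up for illustration, it assumes the kernel log lines look like "name(pid): WRITE block N on dev", and it only applies to kernels that still provide /proc/sys/vm/block_dump. Process names containing spaces would be truncated.

```shell
# Tally block_dump messages per process from kernel-log text on stdin.
# Assumed line format: "[timestamp] name(pid): WRITE block N on dev"
# (the timestamp prefix is optional).
#
# Typical use, as root (not run here):
#   echo 1 > /proc/sys/vm/block_dump; sleep 60; echo 0 > /proc/sys/vm/block_dump
#   dmesg | tally_block_dump
tally_block_dump() {
    awk '
        {
            # The process tag is the field just before READ/WRITE/dirtied.
            for (i = 2; i <= NF; i++) {
                if ($i == "WRITE" || $i == "READ" || $i == "dirtied") {
                    count[$(i - 1)]++
                    break
                }
            }
        }
        END { for (t in count) print t, count[t] }
    ' | sort -k2,2nr -k1,1    # highest count first, then by name
}
```

Syslog daemons also write when block_dump is enabled, so it is usually a good idea to stop them for the duration of the measurement, otherwise they show up in the tally themselves.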

Why is that happening? What else should I look at to limit the kjournald activity a bit? It seems disproportionate to what's actually being written.


Solution 1:

kjournald is the kernel thread that maintains the journal of ext3 (a journaling filesystem). It's known to use a lot of CPU under certain loads. There's not much to do except use another filesystem or disable journaling (effectively making the fs ext2).

Theoretically you can use one of the other ext3 journaling modes and check whether the CPU usage goes down, but remember that each mode is a compromise on the safety of the data being written to the disk. The modes are ordered, writeback and journal.

  1. ordered: journals only metadata, but guarantees that the data associated with a metadata change is written to disk before that change is committed to the journal.
  2. writeback: journals only metadata, with no guarantee that the data is written before the journal commit.
  3. journal: everything is journaled, data and metadata. It may be slow but YMMV.

You set the mode with the data= option when mounting the filesystem, e.g. data=ordered.
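
Before changing anything it can help to see which mode each ext3 filesystem is currently using. The helper below is a small sketch, not a standard tool: ext3_data_modes is a hypothetical name, and it assumes the kernel lists the data= option in /proc/mounts, which may not be the case on every kernel version (it prints "not-listed" then).

```shell
# Print the data= journaling mode of each ext3 mount found in
# /proc/mounts-style text on stdin (device mountpoint fstype options ...).
#
# Typical use:
#   ext3_data_modes < /proc/mounts
# Mounting with another mode (device and mount point are placeholders):
#   mount -o data=writeback /dev/sdXN /mnt/point
ext3_data_modes() {
    awk '$3 == "ext3" {
        mode = "not-listed"            # no data= option reported for this mount
        n = split($4, opts, ",")
        for (i = 1; i <= n; i++)
            if (opts[i] ~ /^data=/) mode = substr(opts[i], 6)
        print $2, mode
    }'
}
```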

Solution 2:

By default your ext3 filesystem is going to be mounted with atimes turned on. Each time a file or directory is read/accessed the filesystem will have to write back to the disks to update this atime record. This means that even if your workload is mostly read based you'll still need to hit the disks to update the access times of each file & directory, and this is my guess as to why your kjournald process was writing so many blocks.

Turning off atime updates can yield a large performance boost, but it breaks POSIX compliance. Check out this Wikipedia article for some discussion around the criticism of atime.

To turn off atimes just add noatime to the mount options for your filesystem, or you can remount as suggested by poige. Here's an example for your root filesystem:

mount -o remount,noatime /
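
To check which ext3 filesystems are still updating atime on every read, something like the following can be used. It is a sketch under assumptions: atime_mounts is a made-up name, and it relies on /proc/mounts listing noatime or relatime among the options whenever one of them is in effect.

```shell
# List ext3 mount points from /proc/mounts-style text on stdin that have
# neither noatime nor relatime among their mount options.
#
# Typical use:
#   atime_mounts < /proc/mounts
# To make the change permanent, add noatime to the options column of the
# matching /etc/fstab entry, e.g. "defaults,noatime".
atime_mounts() {
    awk '$3 == "ext3" && ("," $4 ",") !~ /,(noatime|relatime),/ { print $2 }'
}
```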

Solution 3:

If data integrity is not critical, you can disable the journal. First make sure that it's really kjournald doing the writes:

iotop -o -a

That is what caused my server to crash.

Changing the hard drive to an SSD would also work.

When you see kjournald writing 5-10MB of data, remove the journal as described in

http://ubuntuforums.org/showthread.php?t=56621

The filesystem has to be unmounted (or mounted read-only) for this:

sudo tune2fs -O ^has_journal /dev/sda1
sudo e2fsck /dev/sda1

where sda1 is the name of your partition
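
To confirm whether the journal is present before or after running tune2fs, the feature list printed by tune2fs -l can be inspected. A minimal sketch, assuming the usual "Filesystem features:" line in that output; the function name has_journal is just for illustration:

```shell
# Succeeds (exit status 0) if the `tune2fs -l` output on stdin lists the
# has_journal feature, fails otherwise.
#
# Typical use (sda1 as in the commands above):
#   sudo tune2fs -l /dev/sda1 | has_journal && echo "journal still enabled"
has_journal() {
    grep '^Filesystem features:' | grep -qw 'has_journal'
}
```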

Report the result in a comment so I can check further.