Why are journald logfiles so huge?
When I run journalctl --disk-usage
it reports about 300MB for the journal files, but when I measure the actual text with journalctl | wc -c
it's only about 28MB. I know journald has compression, and even allowing for metadata like timestamp, uid, message hash and such things, this seems to me like a ridiculous waste of disk space.
Can someone tell me why the journal files are so big compared to the actual text inside?
There are two reasons. First, as @Mella mentioned, there is the difference between the current log and all logs.
Second, as documented in man journalctl
, there are a number of output formats. You were measuring the size of the most compact, least detailed one. To see the maximal data in the systemd journal, use:
journalctl --output=verbose
In my case, the compact journal output returns 32 MB of data, while 128 MB are returned with --output=verbose
and 152 MB are reported by journalctl --disk-usage
, covering both active and archived journals.
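A rough way to reproduce these three numbers on your own machine is sketched below. The helper names to_mib and journal_text_bytes are made up for this example and are not part of systemd; short is journalctl's default output format.

```shell
#!/bin/sh
# Sketch: compare how much text two journalctl output formats produce
# with the on-disk size that journald reports.

# Convert a byte count ($1) to whole MiB, rounding down.
to_mib() {
    echo $(( $1 / 1048576 ))
}

# Print the number of bytes journalctl emits for output format $1.
journal_text_bytes() {
    journalctl --no-pager --output="$1" 2>/dev/null | wc -c
}

# Only touch the journal when journalctl is actually installed.
if command -v journalctl >/dev/null 2>&1; then
    for fmt in short verbose; do
        echo "$fmt: $(to_mib "$(journal_text_bytes "$fmt")") MiB of text"
    done
    journalctl --disk-usage || echo "disk usage not available" >&2
fi
```

Run it as root (or as a member of a group that may read the system journal), otherwise you will likely only measure your own user journal.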
See man journald.conf
to learn how to limit how much disk space journald
uses if you are concerned.
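As a minimal sketch of such a limit: SystemMaxUse= is the journald.conf setting that caps the disk space used by persistent journals. The function name journald_limit_fragment and the 200M value are only example choices, so pick a size that suits your system.

```shell
#!/bin/sh
# Sketch: print a journald.conf fragment that caps persistent journal size.
# The 200M default is an arbitrary example value.

# Print a [Journal] section limiting disk usage to $1 (default 200M).
journald_limit_fragment() {
    printf '[Journal]\nSystemMaxUse=%s\n' "${1:-200M}"
}

journald_limit_fragment 200M
```

Put the printed lines into /etc/systemd/journald.conf (or a drop-in file under /etc/systemd/journald.conf.d/) and restart systemd-journald for the limit to take effect. For a one-off cleanup, journalctl --vacuum-size=200M removes archived journal files until they fit within that size.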
- They are huge because of what is kind of a bug:
As indicated upstream, and hence known to the developers of journald, the compression used in the binary log format is not very effective (yet?).
- They are huge because compression may not be activated
There is also an option in /etc/systemd/journald.conf
named Compress=yes
, which might not be active on your system, in which case there is effectively no compression.
- The issue of archived journals does not matter here.
While it is true in principle that journald distinguishes between active and archived journal logs, the other answers are misleading on this point, because man journalctl
states unequivocally:
Output is interleaved from all accessible journal files, whether they are rotated or currently being written, and regardless of whether they belong to the system itself or are accessible user journals.
The other answers are hence misleading here.
- The disk usage of journald is huge (i.e. greater than plain text files with a comparable level of information, that is, fields) because of file allocation, fragmentation, and anti-corruption measures.
"file fragmentation/allocation issues"
On my box, journalctl --version == "systemd 239[...]"
, the journal files that contain the data exist in file sizes that are multiples of 8MiB. As a consequence, a journal file on my system will be 8MiB big even when only a fraction of that (in one case 56kiB) of data is actually stored in it.
"anti corruption issue"
According to Poettering, one of the developers of journald
and systemd
, when a journal is considered to have become corrupted, journald
will not "fix" it but leaves it as is, to prevent further problems (see https://bugs.freedesktop.org/show_bug.cgi?id=64116#c3).
This of course means that there is a good chance that uncompressed, almost empty binary journal files sit around in your /var/log, making it effectively much larger than a sane plaintext alternative.
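To check whether this applies to your system, something like the following sketch can list the journal files with their sizes. It assumes the journals live under /var/log/journal (override with the hypothetical JOURNAL_DIR variable) and that files whose names end in .journal~ are the ones journald set aside after detecting corruption or an unclean shutdown; as far as I know that is the naming convention, but verify it on your version. The function names are made up for this example.

```shell
#!/bin/sh
# Sketch: list journal files with their size, whether that size is a whole
# multiple of 8 MiB, and whether the file was set aside (name ends in "~").
CHUNK=$((8 * 1024 * 1024))

# Print "yes" if the byte count $1 is a non-zero multiple of 8 MiB, else "no".
is_chunk_multiple() {
    if [ "$1" -gt 0 ] && [ $(( $1 % $CHUNK )) -eq 0 ]; then
        echo yes
    else
        echo no
    fi
}

# Print "set-aside" for names ending in .journal~, otherwise "normal".
journal_kind() {
    case "$1" in
        *.journal~) echo set-aside ;;
        *)          echo normal ;;
    esac
}

# Print "<bytes> <yes|no> <kind> <path>" for each journal file under $1.
list_journal_files() {
    [ -d "$1" ] || return 0
    find "$1" -type f \( -name '*.journal' -o -name '*.journal~' \) 2>/dev/null |
    while IFS= read -r f; do
        # Unreadable files count as 0 bytes instead of aborting the listing.
        size=$(( $( { wc -c < "$f"; } 2>/dev/null ) + 0 ))
        echo "$size $(is_chunk_multiple "$size") $(journal_kind "$f") $f"
    done
}

# /var/log/journal is the usual persistent location; adjust if yours differs.
list_journal_files "${JOURNAL_DIR:-/var/log/journal}"
```

Many "yes" rows alongside a small amount of actual log text would match the allocation pattern described above, and "set-aside" rows are the leftover files that journald did not clean up. journalctl --verify can additionally check the files for internal consistency.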