Not the same output format from `df` in different Linux distributions
In Ubuntu the output of this command
df --exclude={tmpfs,devtmpfs,squashfs,overlay} | sed -e /^Filesystem/d | awk '{print $6 " " $1 " " $3 " " $4 " " $5}'
is:
/ /dev/mapper/dockerVG-rootLV 8110496 40591632 17%
/dockerssd /dev/mapper/ssdVG-ssdLV 214133656 274642488 44%
/dockerhdd /dev/mapper/hddVG-hddLV 83278236 1385191240 6%
/var/lib/docker /dev/mapper/hddVG-dockerLV 76046204 412729940 16%
That is what I need.
On CentOS 6 I get this output:
/dev/mapper/vg_rproxy-lv_root
51475068 43192316 12% /
/boot /dev/sda1 82688 379364 18%
/dev/mapper/vg_rproxy-lv_home
77349888 73119692 1% /home
It's a mess.
Full output from CentOS 6:
$ df
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/mapper/vg_rproxy-lv_root
51475068 5661336 43192292 12% /
tmpfs 957140 0 957140 0% /dev/shm
/dev/sda1 487652 82688 379364 18% /boot
/dev/mapper/vg_rproxy-lv_home
77349888 294352 73119692 1% /home
What is the problem? How can I fix it?
tl;dr
Use `df -P`.
Full answer
`/dev/mapper/vg_rproxy-lv_root` and `/dev/mapper/vg_rproxy-lv_home` are relatively long strings. It appears `df` in CentOS decides to split their entries into two lines; this breaks the logic when you want to parse the output further.
In narrow terminals this may be a good thing, creating semi-columnized human-readable output despite limited horizontal space. I would prefer this not to happen when `df` writes to a non-tty (a pipe in your case).
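To see why the split matters for field-based parsing, here is a small sketch that replays part of the CentOS 6 output from the question through `awk` and prints the number of fields on each data line. The helper name `wrapped_df_output` is made up for this illustration, and no real `df` is called.

```shell
# Hypothetical helper: replays a fragment of the CentOS 6 output quoted
# in the question, so the demonstration does not depend on the local df.
wrapped_df_output() {
cat <<'EOF'
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/mapper/vg_rproxy-lv_root
51475068 5661336 43192292 12% /
/dev/sda1 487652 82688 379364 18% /boot
EOF
}

# Print how many whitespace-separated fields awk sees on each data line.
wrapped_df_output | awk 'NR > 1 { print NF }'
```

Only the `/dev/sda1` line has the six fields the original pipeline expects. The wrapped entry arrives as a line with one field followed by a line with five, so `$6`, `$1`, `$3` and so on no longer refer to the intended columns.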
Maybe `df` in Ubuntu behaves similarly if entries in the `Filesystem` column are long; maybe you just didn't experience this because yours are relatively short. I don't know, and it is not important. What is important is that `df` is a POSIX tool and should follow the specification. But the specification explicitly states:
Historical `df` implementations vary considerably in their default output. It was therefore necessary to describe the default output in a loose manner to accommodate all known historical implementations and to add a portable option (`-P`) to provide information in a portable format.
About the option:
-P
Produce output in the format described in the STDOUT section.
And finally the relevant part of the STDOUT section (emphasis mine):
The implementation may adjust the spacing of the header line and the individual data lines so that the information is presented in orderly columns.
The remaining output with `-P` shall consist of one line of information for each specified file system. These lines shall be formatted as follows:
"%s %d %d %d %d%% %s\n", <file system name>, <total space>, <space used>, <space free>, <percentage used>, <file system root>
So `df` is allowed to output anything, unless you use `-P`. Without `-P` some implementations of `df` may produce predictable and parsable output, others… not so much. Their behavior may or may not be documented well enough. Therefore, in general, when parsing the output of `df` you should always use `-P`.
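As an illustration of the six-field line format quoted from the specification, here is a minimal sketch that splits one such line with the shell's `read`. The function name `parse_df_line` is hypothetical, and the sketch assumes that neither the file system name nor the mount point contains blanks.

```shell
# Hypothetical helper: split one "df -P" data line into its six fields
# (name, total, used, free, percentage, root) and print them in the
# order the question asks for. Assumes no blanks inside any field.
parse_df_line() {
    read -r fs total used free pct root <<EOF
$1
EOF
    printf '%s %s %s %s %s\n' "$root" "$fs" "$used" "$free" "$pct"
}

# The /boot line from the question, as a single -P style line.
parse_df_line '/dev/sda1 487652 82688 379364 18% /boot'
```

Because every file system occupies exactly one line in this format, the positions of the fields are stable no matter how long the name is.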
Just adding `-P` will probably be enough to fix your specific problem.
Note that `-P` governs the format only. Overall, the POSIX specification applies only in the POSIX locale. Additionally, modern implementations of `df` tend to use 1024-byte blocks by default, while POSIX states the default is 512. In my Debian 10, `df` from GNU coreutils falls back to the POSIX default when `POSIXLY_CORRECT` is set in the environment. Portably, you can force 1024-byte blocks with `-k`.
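If you want to check which block size a given `df` is reporting, one option is to look at the second field of the header, which the POSIX format names `512-blocks` or `1024-blocks`. The sketch below does that; the helper name `block_unit` is made up here, and it assumes the header is the first line of the input.

```shell
# Hypothetical helper: print the second field of the header line,
# which in the POSIX format names the block size in use.
block_unit() {
    awk 'NR == 1 { print $2 }'
}

# With -k the header should name 1024-byte blocks.
LC_ALL=POSIX df -Pk / | block_unit
```

Running the same check on `LC_ALL=POSIX df -P /`, with and without `POSIXLY_CORRECT` in the environment, shows which default your implementation uses.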
This is a portable command that produces (almost) parsable output:
LC_ALL=POSIX df -Pk
Almost parsable, because entries in the `Filesystem` column may contain spaces, I think; although in a sanely configured OS they don't.
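If spaces in the first column are a concern, one possible workaround is to count fields from the end of the line instead of from the start. The following is only a sketch: `parse_from_right` is a hypothetical name, it assumes the mount point itself contains no blanks, and it collapses any run of blanks inside the name to a single space.

```shell
# Hypothetical helper: read "df -Pk" style output on stdin and print
# mount point, name, used, available and percentage, tab-separated.
# Assumption: the mount point (last field) contains no blanks.
parse_from_right() {
    awk 'NR > 1 {
        # Everything before the last five fields belongs to the name.
        name = $1
        for (i = 2; i <= NF - 5; i++) name = name " " $i
        print $NF "\t" name "\t" $(NF-3) "\t" $(NF-2) "\t" $(NF-1)
    }'
}

LC_ALL=POSIX df -Pk | parse_from_right
```

Tabs are used as the output separator so that a name containing spaces stays distinguishable from the other fields.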
You may omit `LC_ALL=POSIX` and still get the expected results, but in general it should be there for parsing. E.g. in my Polish locale your `sed -e /^Filesystem/d` doesn't do its job because I get a Polish term for "filesystem" from my `df`. `LC_ALL=POSIX` fixes this. Still, my personal preference is not to rely on anything in the header. I would use `sed 1d` or `tail -n +2`; or delegate the task to `awk`, since `awk` is already in your pipeline. This would be:
LC_ALL=POSIX df -Pk --exclude={tmpfs,devtmpfs,squashfs,overlay} \
| awk 'NR>1 {print $6 " " $1 " " $3 " " $4 " " $5}'
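The header-skipping variants mentioned above can be compared on canned input. In this sketch `sample_df` is a hypothetical stand-in for `df -P` whose header is deliberately not the English one; the data lines are adapted from the question.

```shell
# Hypothetical stand-in for df -P with a non-English header line.
sample_df() {
    printf '%s\n' \
        'Some localized header text' \
        '/dev/sda1 487652 82688 379364 18% /boot' \
        'tmpfs 957140 0 957140 0% /dev/shm'
}

# Each of these drops the first line without looking at its content.
sample_df | sed 1d
sample_df | tail -n +2
sample_df | awk 'NR > 1'
```

All three print the same two data lines, because they select by line number and do not match on the text of the header.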
Finally, `--exclude=` is not a portable option. Apparently it works for you on both systems in question, but it may not work on other systems.