regex help on unix df

I need some help tweaking my code to pull another field out of this Unix df output:

Ex.

Filesystem     Size    Used   Avail Capacity  Mounted on
/dev/ad4s1e     61G     46G    9.7G    83%    /home

So far I can extract capacity, but now I want to add Avail.

Here is my perl line that grabs capacity. How do I get "Avail"?? Thanks!

my @df = qx (df -k /tmp);
my $cap;
foreach my $df (@df)
        {
         ($cap) =($df =~ m!(\d+)\%!);
        };

print "$cap\n";

The easy Perl way is to use the Filesys::Df module:

perl -MFilesys::Df -e 'print df("/tmp")->{bavail}, "\n"'

The approach below has the merit of producing a nice data structure that you can query for all the info about each filesystem.

# column headers to be used as hash keys
my @headers = qw(name size used free capacity mount);

my @df = `df -k`;
shift @df;  # get rid of the header

my %devices;
for my $line (@df) {
    my %info;
    @info{@headers} = split /\s+/, $line;  # note the hash slice
    $info{capacity} = _percentage_to_decimal($info{capacity});
    $devices{ $info{name} } = \%info;
}

# Change 12.3% to .123
sub _percentage_to_decimal {
    my $percentage = shift;
    $percentage =~ s{%}{};
    return $percentage / 100;
}

Now the information for each device is in a hash of hashes.

# Show how much space is free in device /dev/ad4s1e
print $devices{"/dev/ad4s1e"}{free};

This isn't the simplest way to do it, but it is the most generally useful way to work with the df information: it puts everything in one nice data structure that you can pass around as needed. This is better than slicing it all up into individual variables, and it's a technique you should get used to.

UPDATE: To get all the devices which are over 60% capacity, you'd iterate through all the values in the hash and select those with a capacity greater than 60%. Except capacity is stored as a string like "88%", which isn't useful for comparison. We could strip out the % here, but then we'd be doing that everywhere we want to use it. It's better to normalize your data up front, which makes it easier to work with. Storing formatted data is a red flag. So I've modified the code above, which reads from df, to change the capacity from 88% to .88.

Now it's easier to work with.

for my $info (values %devices) {
    # Skip to the next device if its capacity is not over 60%.
    next unless $info->{capacity} > .60;

    # Print some info about each device.
    # %.0f rounds; %d would truncate, so .57*100 (56.99999...) would print as 56.
    printf "%s is at %.0f%% with %dK remaining.\n",
        $info->{name}, $info->{capacity}*100, $info->{free};
}

I chose to use printf here rather than interpolation because it makes it a bit easier to see what the string will look like when output.
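
As a rough sketch of what "pass around as needed" can look like, the filter above could be wrapped in a small function that takes the hash of hashes and a threshold. The name devices_over is made up for illustration, and the sample data (including the second device) is invented so the snippet runs without calling df.

```perl
use strict;
use warnings;

# Hypothetical helper: return the info hashes whose capacity exceeds
# $threshold (a fraction such as 0.60), fullest device first.
sub devices_over {
    my ($devices, $threshold) = @_;
    return sort { $b->{capacity} <=> $a->{capacity} }
           grep { $_->{capacity} > $threshold }
           values %$devices;
}

# Sample data in the same shape as %devices above (values are made up).
my %devices = (
    '/dev/ad4s1e' => { name => '/dev/ad4s1e', capacity => 0.83, free => 10171187, mount => '/home' },
    '/dev/ad4s1a' => { name => '/dev/ad4s1a', capacity => 0.41, free => 253000,   mount => '/' },
);

for my $info (devices_over(\%devices, 0.60)) {
    printf "%s is at %.0f%% with %dK remaining.\n",
        $info->{name}, $info->{capacity} * 100, $info->{free};
}
```

Call it in list context, since sort has no defined result in scalar context.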


Have you tried simply splitting on whitespace and taking the 4th and 5th columns?

my @cols = (split(/\s+/, $_));
my $avail = $cols[3];
my $cap   = $cols[4];

(Fails if you have spaces in your device names of course...)
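
If spaces in the device name or mount point are a concern, one possible workaround is a regex anchored on the numeric columns in the middle instead of a plain split. This is only a sketch: parse_df_line is a made-up name, and it assumes the six-column layout shown in the question (some systems print extra columns).

```perl
use strict;
use warnings;

# Hypothetical helper: parse one line of df output into named fields.
# Returns an empty list for lines that do not match (such as the header).
sub parse_df_line {
    my ($line) = @_;
    my ($fs, $size, $used, $avail, $cap, $mount) = $line =~ m{
        \A (.+?)   \s+      # filesystem (may contain spaces)
           (\S+)   \s+      # size
           (\S+)   \s+      # used
           (\S+)   \s+      # avail
           (\d+) % \s+      # capacity, without the percent sign
           (.+?)   \s* \z   # mount point (may contain spaces)
    }x or return;
    return (filesystem => $fs, size => $size, used => $used,
            avail => $avail, capacity => $cap, mount => $mount);
}

my $sample = "/dev/ad4s1e     61G     46G    9.7G    83%    /home\n";
my %info = parse_df_line($sample);
print "avail=$info{avail} capacity=$info{capacity}%\n";
```

For the sample line from the question this prints avail=9.7G capacity=83%.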


Use split instead, and get the values from the resulting array. E.g.

my @values = split /\s+/, $df;
my $avail = $values[3];

Or:

my ($filesystem, $size, $used, $avail, $cap, $mount) = split /\s+/, $df;
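
If only Avail and Capacity are needed, a list slice on the result of split avoids naming the other fields. A minimal sketch, using the sample line from the question in place of real df output:

```perl
use strict;
use warnings;

# Sample line from the question; in real use this would come from df output.
my $df = "/dev/ad4s1e     61G     46G    9.7G    83%    /home\n";

# split ' ' splits on runs of whitespace and ignores leading whitespace.
# The list slice [3, 4] keeps only the 4th and 5th fields.
my ($avail, $cap) = (split ' ', $df)[3, 4];

$cap =~ s/%\z//;    # drop the trailing percent sign
print "avail=$avail cap=$cap\n";
```

This prints avail=9.7G cap=83. The same caveat about spaces in device names applies.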