How do I find files and total their sizes?

I'd like to find a series of files (based on a wildcard expression) and total their disk usage.

Something like this:

$ find . -name 'flibble*' -ctime +90 -exec du -sh {} \;

2.1G    ./flibble_116.log
2.1G    ./flibble_83.log
2.1G    ./flibble_211040_157.log
2.1G    ./flibble3747_51.log

This works, but it doesn't produce the result I'm looking for: it lists the space used by each file as find iterates through them.

What I want is the total du for all of the files found.


Solution

By supplying the option -c (or --total) to du(1), you can instruct it to produce a grand total. If your implementation of du(1) supports either of these options, you can achieve the desired effect using the following command:

$ find . -name 'flibble*' -ctime +90 -exec du -shc {} +

EDIT: Note that if the number of files exceeds the maximum number of parameters permitted by your system, find may still execute the command multiple times, and each run of du prints its own total. Some implementations of du(1) also support reading the filenames from a file, which does not suffer from this limitation:

$ find . -name 'flibble*' -ctime +90 -print0 > filenames
$ du -shc --files0-from=filenames
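
If you would rather not leave a temporary file behind, GNU du also accepts - as the file name for --files0-from, meaning standard input. Here is a minimal sketch of that, assuming GNU find and GNU du; total_matching is a made-up helper name, not part of either tool:

```shell
# Total the disk usage of files under a directory whose names match a pattern.
# Assumes GNU find and GNU du; total_matching is a hypothetical helper name.
total_matching() {
    dir=$1
    pattern=$2
    shift 2
    # Any remaining arguments (for example: -ctime +90) are passed on to find.
    find "$dir" -name "$pattern" "$@" -print0 |
        du -shc --files0-from=- |
        tail -n 1
}

# Example: total_matching . 'flibble*' -ctime +90
```

Because du is started only once here, there is a single total line, and tail -n 1 keeps just that line.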

Explanation

The difference between the semantics of -exec command {} \; and -exec command {} + is the following:

  • command {} \; executes command once for each result of find. The pathname of the result is passed instead of {}.

    $ touch 1 2 3
    $ find  1 2 3 -maxdepth 0 -exec echo {} \;
    1
    2
    3
    
  • command {} + executes command once all the results have been retrieved, passing as many pathnames per invocation as the system allows. The pathnames of the results are passed in place of {}.

    $ touch 1 2 3
    $ find  1 2 3 -maxdepth 0 -exec echo {} +
    1 2 3
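
To see the difference in terms of how often the command is started, the two forms can be compared by counting invocations. This is only a sketch: the temporary directory and the file names a, b and c are made up for the demonstration.

```shell
# Count how many times find runs the -exec command for three matching files.
# The directory and file names below are made up for the demonstration.
demo_dir=$(mktemp -d)
touch "$demo_dir/a" "$demo_dir/b" "$demo_dir/c"

# Each invocation prints one line, so wc -l counts the invocations.
runs_semicolon=$(find "$demo_dir" -type f -exec sh -c 'echo run' sh {} \; | wc -l)
runs_plus=$(find "$demo_dir" -type f -exec sh -c 'echo run' sh {} + | wc -l)

printf 'terminated by ";": %s invocation(s)\n' "$runs_semicolon"
printf 'terminated by "+": %s invocation(s)\n' "$runs_plus"

rm -rf "$demo_dir"
```

With only three short file names the + form fits everything into a single invocation, which is why du -c can then report one grand total.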
    

The -print0 option causes find(1) to print the found filenames to standard output separated by the null character, and the --files0-from option causes du(1) to read the null-separated filenames. Unlike the newline character, the null character cannot appear in a filename, so the output is unambiguous.
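
The ambiguity is easy to reproduce. The sketch below creates one file whose name contains a newline (the name is invented for the demonstration) and counts the entries both ways; it assumes a find that supports -print0 and a tr that can handle the null character, as the GNU versions do.

```shell
demo_dir=$(mktemp -d)
# A single file whose name contains a newline character.
touch "$demo_dir/flibble
two.log"

# Newline-separated output makes the one file look like two entries...
lines=$(find "$demo_dir" -type f -print | wc -l)
# ...while counting null terminators gives the real number of files.
nulls=$(find "$demo_dir" -type f -print0 | tr -cd '\0' | wc -c)

printf 'newline-separated entries: %s\n' "$lines"
printf 'null-terminated entries: %s\n' "$nulls"

rm -rf "$demo_dir"
```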

To learn more about the options of du(1) and find(1), you should consult the respective manpages:

$ man du
$ man find

Try this:

du -c $(find . -name 'flibble*' -ctime +90) | tail -n 1

The original command gives du one argument, runs it, and repeats until every file has been processed. This version passes all the arguments to du at once, then cuts off the per-file sizes and leaves only the total. Remove the pipe and tail if you also want the size of each file. Note that the command substitution is split on whitespace, so this breaks if any matching filename contains spaces or newlines.


You can try this:

find . -name 'flibble*' -ctime +90 -exec du -ch {} + | grep 'total$'
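
If find has to run du more than once because there are too many files, the output contains several total lines. One way to handle that, sketched here under the assumption that du separates size and name with a tab (as GNU and BSD du do), is to ask for plain 1 KiB units with -k and add the total lines up. sum_totals is a made-up helper name:

```shell
# sum_totals is a hypothetical helper: it adds up every "total" line that
# du -c prints, in case find had to run du more than once.
sum_totals() {
    # du -ck reports sizes in 1 KiB units, so the totals can simply be added.
    awk -F '\t' '$2 == "total" { sum += $1 } END { print sum + 0 }'
}

# Example: find . -name 'flibble*' -ctime +90 -exec du -ck {} + | sum_totals
```

The result is in KiB. Human-readable sizes from -h are dropped on purpose here, since values like 2.1G and 512K cannot be added directly.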

I would have find itself print the size of each file (in bytes, using GNU find's -printf), and use another tool to calculate the total:

find . -name 'flibble*' -ctime +90 -printf "%s\n" |
perl -lnE '$sum += $_} END {say $sum'

If you also want to see the filenames:

find . -name 'flibble*' -ctime +90 -printf "%s\t%p\n" |
perl -apE '$sum += $F[0]} END {say $sum'
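
If Perl isn't available, the same summing step can be sketched with awk. sum_bytes is a made-up helper name, and -printf still requires GNU find. Keep in mind that %s is the file's size in bytes, which can differ from the disk usage that du reports.

```shell
# sum_bytes is a hypothetical helper: it adds up one byte count per input line.
sum_bytes() {
    # %.0f prints the sum as a whole number without exponent notation.
    awk '{ sum += $1 } END { printf "%.0f\n", sum }'
}

# Example: find . -name 'flibble*' -ctime +90 -printf '%s\n' | sum_bytes
```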