How to md5 a list of filepaths contained in a file?

I have a folder containing many folders containing many files. Thousands.

I can do find . -type f > ./FILE-LISTING.TXT to create a file containing many thousands of file paths that looks like this:

./Anders/Letters/20190101 Rent.pdf
./Anders/Letters/20190103 Appeal.pdf
./Anders/Letters/20190107 Decision.pdf
./Beeker/Letters/20180101 Rent.pdf

How would I feed that list of filepaths into md5 to produce an output formatted like this:

9cf14e4d666dcb6aab17763b02429a19 ./Anders/Letters/20190101 Rent.pdf
d1bb70baa31f1df69628c00632b65eab ./Anders/Letters/20190103 Appeal.pdf
7a0f5bc18688fe8ba32f43aa6ec53fb1 ./Anders/Letters/20190107 Decision.pdf
a0c96a79cf3b1847025d9f073151519d ./Beeker/Letters/20180101 Rent.pdf

NB: I want the md5 hashes of the referenced files, not the md5 of the list of files, nor the md5 hashes of the strings in the file-listing.txt.

Also, would it be faster to do it all in one command line, or do it in two passes (find to create file-listing.txt, then md5 to create file-listing-md5.txt)?


Solution 1:

find . -type f -exec /sbin/md5 -r {} +
       ^^^^^^^ ^^^^^ ^^^^^^^^^^^^ ^^ ^
          |      |        |       |  |
          |      |        |       |  +- add as many file names as possible per call
          |      |        |       +---- replace with names of found files
          |      |        +------------ command to run
          |      +--------------------- execute following command
          +---------------------------- match only regular files

should do the trick (and take care of the usual issues with spaces etc within filenames).

As for speed: one pass is almost always faster than two. In this specific case, though, the MD5 calculation takes so much time that other factors can most probably be ignored.

PS: Tip of the hat to @lhf for reminding me of -r
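
To capture the one-pass result in a file, a minimal sketch along these lines might work. The function name md5_tree and the md5sum fallback are assumptions added for illustration, not part of the original answer. Because the shell creates the output file before find starts, the sketch excludes that file from the scan:

```shell
# md5_tree OUTFILE: hash every regular file under the current directory
# and write "hash path" lines to OUTFILE. (Hypothetical helper name.)
md5_tree() {
    out=$1
    # The redirection creates $out before find runs, so skip it explicitly.
    if command -v md5 >/dev/null 2>&1; then
        find . -type f ! -path "./$out" -exec md5 -r {} + > "$out"
    else
        # Assumption: GNU md5sum as a fallback where md5 is not installed.
        # It prints two spaces between hash and path instead of one.
        find . -type f ! -path "./$out" -exec md5sum {} + > "$out"
    fi
}
```

Called as md5_tree FILE-LISTING-MD5.TXT from the top of the tree, this should leave one line per file in the output file.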

Solution 2:

Try this:

find . -type f -print0 | xargs -0 md5 -r

Note the -print0 and -0 options: they pass the file names NUL-separated, so spaces (and even newlines) in filenames are handled correctly.

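Compared to find . -type f -exec ... {} \; (which starts md5 once per file), this solution runs md5 far less often. The -exec ... {} + form batches file names in much the same way as xargs, so against that form there is unlikely to be a measurable difference.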
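
If the listing file from the question already exists, the same xargs approach can probably consume it directly instead of re-running find. The following is only a sketch: the function name md5_from_list and the md5sum fallback are assumptions for illustration, and it presumes one path per line with no newlines inside any path.

```shell
# md5_from_list LISTFILE: hash each path named in LISTFILE (one per line).
# (Hypothetical helper name; assumes no path contains a newline.)
md5_from_list() {
    # Convert newlines to NUL separators so xargs -0 keeps spaces intact.
    if command -v md5 >/dev/null 2>&1; then
        tr '\n' '\0' < "$1" | xargs -0 md5 -r
    else
        # Assumption: GNU md5sum as a fallback where md5 is not installed.
        # It prints two spaces between hash and path instead of one.
        tr '\n' '\0' < "$1" | xargs -0 md5sum
    fi
}
```

Run it from the directory where the listing was created, since the paths are relative, for example md5_from_list FILE-LISTING.TXT > FILE-LISTING-MD5.TXT. The listing file itself will be hashed too if it appears in the list.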