How to md5 a list of filepaths contained in a file?
I have a folder containing many folders containing many files. Thousands.
I can run
find . -type f > ./FILE-LISTING.TXT
to create a file containing many thousands of file paths that looks like this:
./Anders/Letters/20190101 Rent.pdf
./Anders/Letters/20190103 Appeal.pdf
./Anders/Letters/20190107 Decision.pdf
./Beeker/Letters/20180101 Rent.pdf
How would I feed that list of file paths into md5 to produce output formatted like this:
9cf14e4d666dcb6aab17763b02429a19 ./Anders/Letters/20190101 Rent.pdf
d1bb70baa31f1df69628c00632b65eab ./Anders/Letters/20190103 Appeal.pdf
7a0f5bc18688fe8ba32f43aa6ec53fb1 ./Anders/Letters/20190107 Decision.pdf
a0c96a79cf3b1847025d9f073151519d ./Beeker/Letters/20180101 Rent.pdf
NB: I want the md5 hashes of the referenced files, not the md5 of the list of files, nor the md5 hashes of the strings in the file-listing.txt.
Also, would it be faster to do it all in one command line, or to do it in two passes (find to create file-listing.txt, then md5 to create file-listing-md5.txt)?
Solution 1:
find . -type f -exec /sbin/md5 -r {} +
       ^^^^^^^ ^^^^^ ^^^^^^^^^^^^ ^^ ^
          |      |        |       |  |
          |      |        |       |  +- add as many file names as possible per call
          |      |        |       +---- replace with names of found files
          |      |        +------------ command to run
          |      +--------------------- execute following command
          +---------------------------- find any "normal" file
should do the trick (and take care of the usual issues with spaces etc within filenames).
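To see what the trailing + changes, here is a small sketch that substitutes echo for md5, so each invocation prints exactly one line. The function names are made up for illustration, and the count assumes that no filename contains a newline and that all names fit into a single batch.

```shell
# Hypothetical helpers: count how many times find invokes the command.
# echo prints one line per invocation, so wc -l counts the invocations.
count_calls_batched() {
    # {} + : append as many file names as possible to each call
    find "$1" -type f -exec echo {} + | wc -l | tr -d ' '
}

count_calls_per_file() {
    # {} \; : run the command once for every file found
    find "$1" -type f -exec echo {} \; | wc -l | tr -d ' '
}
```

For a folder holding three files, count_calls_batched should report 1 and count_calls_per_file should report 3.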
As for speed: one pass is almost always faster than two. In this specific case, the MD5 calculation takes so much time that other factors can most probably be ignored.
PS: Tip of the hat to @lhf for reminding me of -r
Solution 2:
Try this:
find . -type f -print0 | xargs -0 md5 -r
Note the -print0 and -0 options: they separate filenames with NUL bytes, so spaces (and even newlines) in filenames are handled correctly.
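Compared to find . -type f -exec md5 -r {} \; (which starts md5 once per file), this solution runs md5 far less often, although this might not have a measurable impact. Note that find's -exec ... {} + form batches filenames in the same way as xargs.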