How do I compare two files containing several md5 checksums to determine changed files?

I have two files MD1 and MD2.

MD1 contains md5sums:

5f31caf675f2542a971582442a6625f6  /root/md5filescreator/hash1.txt
4efe4ba4ba9fd45a29a57893906dcd30  /root/md5filescreator/hash2.txt
1364cdba38ec62d7b711319ff60dea01  /root/md5filescreator/hash3.txt

where hash1, hash2 and hash3 are three files present in folder md5filescreator.

Similarly MD2 contains:

163559001ec29c4bbbbe96344373760a  /root/md5filescreators/hash1.txt
4efe4ba4ba9fd45a29a57893906dcd30  /root/md5filescreators/hash2.txt
1364cdba38ec62d7b711319ff60dea01  /root/md5filescreators/hash3.txt

where these files are in folder md5filescreators.

I want to compare the checksums in md5filescreator with the corresponding file's checksum in md5filescreators.

The shell script should return OK for files whose checksums match and FALSE for those whose checksums differ, along with the file names.

Can this be done using md5sum --check (since it normally checks for any changes in only 1 MD5 file)?


I want to know if this can be done using md5sum --check? (since it normally checks for any changes in only 1 MD5 file).

No, it can't.

md5sum --check reads the file path in the second column of each line of its input, computes that file's MD5 checksum, and compares it against the checksum in the first column. It does not compare two checksum lists with each other, so if you want to compare the checksums in the two files directly, you'll have to compare the text files themselves.
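
To illustrate what md5sum --check does with a single list, here is a minimal sketch. The function name md5_check_demo, the temporary directory and the file name demo.txt are made up for the example and are not part of the question's setup:

```shell
#!/bin/sh
# Hypothetical demo: names and paths are illustrative only.
md5_check_demo() {
    dir=$(mktemp -d) || return 1
    printf 'hello\n' > "$dir/demo.txt"

    # Record the checksum: one "<md5>  <path>" line, like the lines in MD1.
    md5sum "$dir/demo.txt" > "$dir/sums.md5"

    # --check re-reads each listed path and recomputes its checksum.
    md5sum --check "$dir/sums.md5"

    # Once the file changes, the recorded checksum no longer matches,
    # and md5sum exits with a non-zero status.
    printf 'changed\n' >> "$dir/demo.txt"
    md5sum --check "$dir/sums.md5" || echo "mismatch detected"

    rm -r "$dir"
}

md5_check_demo
```

The list is only ever compared against the files on disk, never against a second list.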

Using paste + AWK you could do:

paste file1 file2 | awk '{x = $1 == $3 ? "OK" : "FALSE"; print $2" "x}'
  • paste file1 file2: joins line N of file1 with line N of file2, separated by a tab, so both files must list the files in the same order;
  • awk '{x = $1 == $3 ? "OK" : "FALSE"; print $2" "x}': if the first field is equal to the third field (i.e. the MD5 sums match), assigns "OK" to x, otherwise assigns "FALSE" to x; then prints the second field (i.e. the filename from file1) followed by the value of x.
% cat file1
5f31caf675f2542a971582442a6625f6 /root/md5filescreator/hash1.txt
4efe4ba4ba9fd45a29a57893906dcd30 /root/md5filescreator/hash2.txt
1364cdba38ec62d7b711319ff60dea01 /root/md5filescreator/hash3.txt
% cat file2
163559001ec29c4bbbbe96344373760a /root/md5filescreators/hash1.txt
4efe4ba4ba9fd45a29a57893906dcd30 /root/md5filescreators/hash2.txt
1364cdba38ec62d7b711319ff60dea01 /root/md5filescreators/hash3.txt
% paste file1 file2 | awk '{x = $1 == $3 ? "OK" : "FALSE"; print $2" "x}'
/root/md5filescreator/hash1.txt FALSE
/root/md5filescreator/hash2.txt OK
/root/md5filescreator/hash3.txt OK

A simple way of checking this would be to see which lines are not duplicated across both files:

sort file1 file2 | uniq --unique

uniq --unique (short form: uniq -u) prints only the lines that occur exactly once in its sorted input. Files whose hashes match produce duplicated lines, so they won't appear in the output. To simply test whether any output is produced, use grep:

sort file1 file2 | uniq --unique | grep -q .
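
As a sketch of how that exit status could be used in a script: the function name lists_differ is made up here, and it assumes both lists use identical paths, so that matching entries really are identical lines.

```shell
#!/bin/sh
# Hypothetical helper: succeeds (exit status 0) when at least one line
# appears in only one of the two lists, i.e. when something differs.
lists_differ() {
    sort "$1" "$2" | uniq --unique | grep -q .
}
```

It could then be called as: if lists_differ file1 file2; then echo FALSE; else echo OK; fi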

In this case, since the directories are different, a bit more processing is needed:

awk -F/ '{print $1, $NF}' file1 file2 | sort | uniq --unique | awk '!a[$2]++{print $2}'

Or, entirely in awk:

awk -F/ 'FNR == NR {hash[$NF] = $1; next} hash[$NF] != $1 {print $NF}' file1 file2

In both cases, you get just the filenames whose hashes differ.
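
If you want the OK/FALSE report from the question rather than just the names that differ, the same idea can be extended. This is only a sketch: the function name compare_md5_lists is made up, and it assumes each file's base name is unique within a list.

```shell
#!/bin/sh
# Hypothetical helper: $1 and $2 are md5sum-style lists ("<md5>  <path>").
# Entries are matched by base name, so the directories may differ.
compare_md5_lists() {
    awk '
        {
            sum = $1
            name = $0
            sub(/^[^ ]+ +/, "", name)   # drop the checksum column
            sub(/.*\//, "", name)       # keep only the base name
        }
        FNR == NR { hash[name] = sum; next }   # first list: remember checksums
        { print name, (hash[name] == sum ? "OK" : "FALSE") }
    ' "$1" "$2"
}
```

Called as compare_md5_lists MD1 MD2, it prints one line per entry in MD2. A file that is listed only in MD2 is reported as FALSE, because no checksum was recorded for it from MD1.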