How to automate comparison of md5sum hash values for a large number of files
I can check the md5sum hash of a file from a terminal like this:
$ md5sum my_sensitive_file
8dad53cfc973c59864b8318263737462  my_sensitive_file
The difficult part is comparing that hash value with the expected one.
Comparing the 32-character output against the original hash value by eye is hard to do for a large number of files. The job is monotonous and leaves a lot of room for error.
Is it possible to automate the comparison, preferably from the CLI?
Solution 1:
For example, I have a file called test_binary.
The MD5 sum of that file is ef7ab26f9a3b2cbd35aa3e7e69aad86c
To check it automatically, run this:
$ md5sum -c <<<"ef7ab26f9a3b2cbd35aa3e7e69aad86c *path/to/file/test_binary"
path/to/file/test_binary: OK
or
$ echo "595f44fec1e92a71d3e9e77456ba80d1  filetohashA.txt" | md5sum -c -
Quote from man
-c, --check
read MD5 sums from the FILEs and check them
Quote from wiki
Note: There must be two spaces between each md5sum value and filename to be compared. Otherwise, the following error will result: "no properly formatted MD5 checksum lines found".
Link to wiki
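As a minimal sketch of the two-space format from the note above, the line fed to md5sum -c can be built with printf instead of being typed by hand. The file name sample.txt and the temporary directory are made up for illustration, and the "expected" hash is simply computed on the spot so the example is self-contained. In real use it would be copied from a trusted source.

```shell
# Work in a throwaway directory (hypothetical sample data only).
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf 'hello\n' > sample.txt

# Stand-in for a hash copied from a trusted source.
expected=$(md5sum sample.txt | cut -d ' ' -f 1)

# Build "<hash><space><space><name>" and let md5sum do the comparison.
# The exit status of md5sum -c is 0 only when every listed file matches.
if printf '%s  %s\n' "$expected" sample.txt | md5sum -c -; then
    result="match"
else
    result="mismatch"
fi
echo "$result"
```

Because the verdict comes from the exit status, the same pattern can be dropped into a larger script without parsing the OK/FAILED text.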
You can also read the MD5 hashes from a file:
$ md5sum -c md5sum_formatted_file.txt
It expects a file in this format:
<md5sum_checksum><space><space><file_name>
About the * and <space> after the MD5 hash, there is a short note in the man page:
When checking, the input should be a former output of this program. The default mode is to print a line with checksum, a character indicating input mode ('*' for binary, space for text), and name for each FILE.
And here is a link to Stack Overflow where I found an answer to the question of why we sometimes need to distinguish between binary files and text files.
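To see the mode character from the man page note in practice, here is a small sketch. It assumes GNU md5sum, whose -b (--binary) option marks the file name with '*'. The file sample.bin is a made-up example.

```shell
# Throwaway directory with a hypothetical sample file.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf 'data\n' > sample.bin

# Default output: hash, space, space (text-mode marker), file name.
default_line=$(md5sum sample.bin)
# With -b: hash, space, '*' (binary-mode marker), file name.
binary_line=$(md5sum -b sample.bin)
echo "$default_line"
echo "$binary_line"

# Both forms are valid input for -c.
if printf '%s\n%s\n' "$default_line" "$binary_line" | md5sum -c -; then
    check_result="both lines verify"
else
    check_result="verification failed"
fi
echo "$check_result"
```

The hash itself is the same in both lines. Only the marker character before the file name differs.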
Solution 2:
One possibility is to use the utility cfv
sudo apt-get install cfv
CFV supports many types of hashes, and both testing and hash file creation.
# List the files
$ ls
test.c
# Create a hash file
$ cfv -tmd5 -C
temp.md5: 1 files, 1 OK. 0.001 seconds, 302.7K/s
# Test the hash file
$ cfv -tmd5 -T
temp.md5: 1 files, 1 OK. 0.001 seconds, 345.1K/s
# Display the hash file
$ cat *.md5
636564b0b10b153219d6e0dfa917d1e3 *test.c
Solution 3:
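Yes, in the examples below the asterisk * is required: md5sum expects a mode character ('*' for binary, or a second space for text) between the separating space and the file name. Take a look at this example.
This is the binary file, and let's say the correct md5sum value is exampleofcorrectmd5value00000000 (a placeholder standing in for 32 hexadecimal characters)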
[root@Linux update]# ls -lh
total 137M
-rw-r--r-- 1 root root 137M Nov 5 13:01 binary-file.run.tgz
[root@Linux update]#
-c, --check
read MD5 sums from the FILEs and check them
If the md5sum value matches the binary file, you'll get this output
[root@Linux ~]# md5sum -c <<< "exampleofcorrectmd5value00000000 *binary-file.run.tgz"
binary-file.run.tgz: OK
[root@Linux ~]#
And this is when the md5sum value doesn't match
[root@Linux update]# md5sum -c <<< "exampleofwrongmd5value0000000000 *binary-file.run.tgz"
binary-file.run.tgz: FAILED
md5sum: WARNING: 1 of 1 computed checksum did NOT match
[root@Linux update]#
Without the asterisk *, and with only a single space between the hash and the file name, you'll get the following error message even though the md5 value is correct
[root@Linux ~]# md5sum -c <<< "exampleofcorrectmd5value00000000 binary-file.run.tgz"
md5sum: standard input: no properly formatted MD5 checksum lines found
[root@Linux ~]#
Also, you'll get the same error message if the md5sum value doesn't have exactly 32 hexadecimal characters. In this example, it has only 31 characters.
[root@Linux ~]# md5sum -c <<< "exampleofmd5valuelessthan32char *binary-file.run.tgz"
md5sum: standard input: no properly formatted MD5 checksum lines found
[root@Linux ~]#
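Since a hash of the wrong length produces the same unhelpful "no properly formatted" error, it can be worth checking the string before handing it to md5sum -c. The helper below is only a sketch: the function name is_md5_hex is made up for this example, and the first candidate value is the hash from the question.

```shell
# is_md5_hex (hypothetical helper): succeeds only if its argument is
# exactly 32 hexadecimal characters.
is_md5_hex() {
    printf '%s\n' "$1" | grep -Eq '^[0-9A-Fa-f]{32}$'
}

for candidate in 8dad53cfc973c59864b8318263737462 exampleofmd5valuelessthan32char; do
    if is_md5_hex "$candidate"; then
        echo "$candidate: looks like an MD5 hash"
    else
        echo "$candidate: not 32 hexadecimal characters"
    fi
done
```

This only checks the shape of the string. Whether the hash actually matches a file is still decided by md5sum -c.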
Solution for many files
If you have many files and want to automate the process, you can follow these steps:
user@Ubuntu:~$ ls -lh
total 12K
-rw-rw-r-- 1 user user 4 Nov 5 14:54 file-a
-rw-rw-r-- 1 user user 4 Nov 5 14:54 file-b
-rw-rw-r-- 1 user user 4 Nov 5 14:54 file-c
user@Ubuntu:~$
Generate the md5sum for each file and save it to md5sum.txt
user@Ubuntu:~$ md5sum * | tee md5sum.txt
0bee89b07a24ae27c83fc3d5951213c1  file-a
1b2297c171a9a450d184871ccf6c9ad4  file-b
7f4d13d9b0b6ac086fd68637067435c5  file-c
user@Ubuntu:~$
To check md5sum for all files, use the following command.
user@Ubuntu:~$ md5sum -c md5sum.txt
file-a: OK
file-b: OK
file-c: OK
user@Ubuntu:~$
This is an example of what happens when the md5sum value doesn't match the file. In this case, I'm going to modify the content of file-b
user@Ubuntu:~$ echo "new data" > file-b
user@Ubuntu:~$
See, this is the error message. Hope this helps.
user@Ubuntu:~$ md5sum -c md5sum.txt
file-a: OK
file-b: FAILED
file-c: OK
md5sum: WARNING: 1 computed checksum did NOT match
user@Ubuntu:~$
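For a larger tree, the same idea can be scripted so that only problems are reported. This is a sketch under a few assumptions: GNU md5sum (for the --quiet option), a made-up directory layout under data/, and a checksum list kept outside that directory so it does not end up hashing itself.

```shell
# Throwaway directory with hypothetical sample files.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
mkdir -p data/sub
printf 'a\n' > data/file-a
printf 'b\n' > data/sub/file-b

# Hash every regular file under data/, including subdirectories.
find data -type f -exec md5sum {} + > checksums.md5

# --quiet suppresses the "OK" lines; the exit status carries the verdict.
if md5sum -c --quiet checksums.md5; then
    first_run="clean"
else
    first_run="mismatch"
fi

# Change one file and check again; only the failing file is listed.
printf 'new data\n' > data/sub/file-b
if report=$(md5sum -c --quiet checksums.md5 2>/dev/null); then
    second_run="clean"
else
    second_run="mismatch"
fi
echo "first run: $first_run"
echo "second run: $second_run"
echo "$report"
```

With thousands of files, printing only the FAILED lines keeps the output readable, and the exit status can be used directly in cron jobs or other scripts.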