How do I do a binary diff on two identically sized files under Linux?
I have two identically sized files, and I need to do a binary diff to check whether they're the same.
I'm currently running diff file1.img file2.img
but it's taking quite a while to process my 4 GB files. Is this the most efficient way to do this?
cmp is designed to find differences in binary files. You might also try checksumming both files (sum) and comparing the hashes.
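As a minimal sketch of the cmp approach: cmp stops at the first differing byte, and with -s it prints nothing and reports the result only through its exit status (0 for identical, 1 for different). The small temporary files below are hypothetical stand-ins for the 4 GB images.

```shell
# Create two small sample files (hypothetical stand-ins for the real images).
tmpdir=$(mktemp -d)
printf 'abcdef' > "$tmpdir/file1.img"
printf 'abcxef' > "$tmpdir/file2.img"

# cmp -s is silent; exit status 0 means identical, 1 means different.
if cmp -s "$tmpdir/file1.img" "$tmpdir/file2.img"; then
    result="identical"
else
    result="different"
fi
echo "$result"
```

Without -s, cmp also reports the byte offset and line number of the first difference.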
One of the most common ways of determining whether two files are identical (assuming their sizes match) is to use a program to create a "hash" (essentially a fingerprint) of each file. The most common such programs are md5sum and sha1sum.
For example:
$ md5sum file1 file2
e0e7485b678a538c2815132de7f9e878 file1
4a14aace18d472709ccae3910af55955 file2
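If you would rather not compare the hashes by eye, the comparison can be scripted. The following is a rough sketch, assuming GNU md5sum and cut are available; the file names and the variables hash1, hash2 and verdict are made up for illustration.

```shell
# Two small sample files with identical contents (hypothetical stand-ins).
tmpdir=$(mktemp -d)
printf 'same data\n' > "$tmpdir/file1"
printf 'same data\n' > "$tmpdir/file2"

# Reading from stdin makes md5sum print "-" instead of a file name,
# so cut keeps only the hash field for the comparison.
hash1=$(md5sum < "$tmpdir/file1" | cut -d' ' -f1)
hash2=$(md5sum < "$tmpdir/file2" | cut -d' ' -f1)

if [ "$hash1" = "$hash2" ]; then
    verdict="hashes match"
else
    verdict="hashes differ"
fi
echo "$verdict"
```

Note that hashing has to read both files in full, so for a one-off comparison of two local files it does not save any I/O.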
If you have many files to check, for example when transferring a directory full of files from one system to another, you can redirect the output on the original system to a file; md5sum/sha1sum can then use that file to tell you automatically which files are different:
$ md5sum file1 file2 > MD5SUMS
... copy file1, file2, MD5SUMS across
$ md5sum --check MD5SUMS
file1: OK
file2: OK
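For a script, the exit status of the check is usually more useful than the per-file lines. Here is a small sketch, assuming GNU md5sum, that simulates a file changing after the checksums were recorded; the temporary directory and the report and rc variables are illustrative only.

```shell
# Record checksums for two sample files in a scratch directory.
tmpdir=$(mktemp -d)
printf 'one\n' > "$tmpdir/file1"
printf 'two\n' > "$tmpdir/file2"
(cd "$tmpdir" && md5sum file1 file2 > MD5SUMS)

# Simulate file2 being altered during the copy.
printf 'changed\n' > "$tmpdir/file2"

# --quiet suppresses the "OK" lines, so only mismatches are listed.
# The exit status is non-zero if any file fails the check.
report=$(cd "$tmpdir" && md5sum --check --quiet MD5SUMS 2>/dev/null) && rc=0 || rc=$?
echo "$report"
echo "exit status: $rc"
```

Here the report lists only the file whose checksum no longer matches, and the non-zero exit status can be tested directly in a script.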