How to count differences between two files on linux?

I need to work with large files and must find differences between two. And I don't need the different bits, but the number of differences.

To find the number of different rows I come up with

diff --suppress-common-lines --speed-large-files -y File1 File2 | wc -l

And it works, but is there a better way to do it?

And how to count the exact number of differences (with standard tools like bash, diff, awk, sed some old version of perl)?


If you want to count the number of lines that are different use this:

diff -U 0 file1 file2 | grep ^@ | wc -l

Doesn't John's answer double count the different lines?


diff -U 0 file1 file2 | grep -v ^@ | wc -l

That minus 2 for the two file names at the top of the diff listing. Unified format is probably a bit faster than side-by-side format.


If using Linux/Unix, what about comm -1 file1 file2 to print lines in file1 that aren't in file2, comm -1 file1 file2 | wc -l to count them, and similarly for comm -2 ...?


Since every output line that differs starts with < or > character, I would suggest this:

diff file1 file2 | grep ^[\>\<] | wc -l

By using only \< or \> in the script line you can count differences only in one of the files.


I believe the correct solution is in this answer, that is:

$ diff -y --suppress-common-lines a b | grep '^' | wc -l
1