How to diff large files on Linux

Solution 1:

cmp works byte by byte, so it probably won't run out of memory (I just tested it on two 7 GB files) -- but you may be looking for more detail than "files X and Y differ at byte x, line y". If your files' matching content is offset (e.g., file Y contains an identical block of text, but not at the same location), you can pass byte offsets to cmp; you could probably turn that into a resynchronizing compare with a small script.
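As a sketch of the offset idea: cmp accepts optional trailing skip1/skip2 operands, which are byte offsets into each file (the file names and the 4-byte prefix below are invented for the demo):

```shell
# Hypothetical demo: FILE2 is FILE1 with 4 junk bytes prepended.
printf 'hello large file' > FILE1
printf 'XXXXhello large file' > FILE2

# Plain cmp reports a difference immediately.
cmp FILE1 FILE2 || echo "files differ as-is"

# Skipping 0 bytes into FILE1 and 4 bytes into FILE2 realigns them.
cmp FILE1 FILE2 0 4 && echo "identical once offset"
```

A resynchronizing script would search for the offsets at which the comparison goes clean, then repeat from the next mismatch.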

Aside: In case anyone else lands here looking for a way to confirm that two directory structures (containing very large files) are identical: diff --recursive --brief (diff -rq for short) will work and not run out of memory.
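For example (directory names are invented for the demo):

```shell
# Two trees that differ by one extra file.
mkdir -p treeA/sub treeB/sub
printf 'same data' > treeA/sub/f.bin
printf 'same data' > treeB/sub/f.bin
printf 'extra'     > treeB/leftover

# -r recurses; -q reports only *which* files differ, skipping the
# line-by-line diff output. diff exits 1 when it finds differences,
# so guard it in scripts:
diff -rq treeA treeB || true
```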

Solution 2:

I found this link

diff -H (--speed-large-files in GNU diff) might help, or you can try installing the textproc/2bsd-diff port, which apparently doesn't try to load the files into RAM and so copes with large files more easily.
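For reference, -H is a heuristic that assumes large files with many scattered small changes; the output is ordinary diff output (the file names and contents below are made up for the demo):

```shell
# Build a pair of files differing in one line.
seq 1 100000 > big-old.txt
sed 's/^50000$/changed/' big-old.txt > big-new.txt

# Same usage as plain diff; exits 1 when differences are found.
diff -H big-old.txt big-new.txt || true
```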

I'm not sure if you tried those two options or if they might work for you. Good luck.

Solution 3:

If the files are identical in length and differ only in a few byte values, you can use a script like the following (w is the number of bytes per hexdump line; adjust it to your display width):

# bash required: the loop reads two process substitutions via fds 7 and 8.
w=12
while read -ru7 x && read -ru8 y; do
  # Print only the hexdump lines (offset + hex bytes + ASCII) that differ.
  [ ".$x" = ".$y" ] || echo "$x | $y"
done 7< <(od -vw"$w" -tx1z FILE1) 8< <(od -vw"$w" -tx1z FILE2) > DIFF-FILE1-FILE2 &

# The compare runs in the background; view its output as it grows:
less DIFF-FILE1-FILE2

It's not very fast, but it does the job. (od's -v matters here: without it, od collapses repeated lines into a "*", which would throw the two streams out of alignment.)
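Under the same assumption (same-length files differing in a few bytes), plain cmp -l gives a byte-level listing without the helper script, and it also streams rather than loading the files (file names and contents are made up for the demo):

```shell
# Two same-length files differing at bytes 3 and 6.
printf 'abcdef' > FILE1
printf 'abXdeY' > FILE2

# cmp -l prints one line per differing byte:
#   <1-based byte offset> <octal value in FILE1> <octal value in FILE2>
# It exits 1 when differences are found, hence the guard.
cmp -l FILE1 FILE2 || true
```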