Compare two different files line by line in python
Solution 1:
This solution reads both files in one pass, excludes blank lines, and prints common lines regardless of their position in the file:
with open('some_file_1.txt', 'r') as file1:
with open('some_file_2.txt', 'r') as file2:
same = set(file1).intersection(file2)
same.discard('\n')
with open('some_output_file.txt', 'w') as file_out:
for line in same:
file_out.write(line)
Solution 2:
Yet another example...
from __future__ import print_function #Only for Python2
with open('file1.txt') as f1, open('file2.txt') as f2, open('outfile.txt', 'w') as outfile:
for line1, line2 in zip(f1, f2):
if line1 == line2:
print(line1, end='', file=outfile)
And if you want to eliminate common blank lines, just change the if statement to:
if line1.strip() and line1 == line2:
.strip()
removes all leading and trailing whitespace, so if that's all that's on a line, it will become an empty string ""
, which is considered false.
Solution 3:
If you are specifically looking for getting the difference between two files, then this might help:
with open('first_file', 'r') as file1:
with open('second_file', 'r') as file2:
difference = set(file1).difference(file2)
difference.discard('\n')
with open('diff.txt', 'w') as file_out:
for line in difference:
file_out.write(line)
Solution 4:
If order is preserved between files you might also prefer difflib
. Although Robᵩ's result is the bona-fide standard for intersections you might actually be looking for a rough diff-like:
from difflib import Differ
with open('cfg1.txt') as f1, open('cfg2.txt') as f2:
differ = Differ()
for line in differ.compare(f1.readlines(), f2.readlines()):
if line.startswith(" "):
print(line[2:], end="")
That said, this has a different behaviour to what you asked for (order is important) even though in this instance the same output is produced.