Comparing two text files
Try this command:
grep -v -f file2.csv file1.csv > file3.csv
According to grep manual:
-f FILE, --file=FILE
Obtain patterns from FILE, one per line. The empty file
contains zero patterns, and therefore matches nothing. (-f is
specified by POSIX.)
-v, --invert-match
Invert the sense of matching, to select non-matching lines. (-v
is specified by POSIX.)
As Steeldriver said in his comment is better add also -x
and -F
that:
-F, --fixed-strings
Interpret PATTERN as a list of fixed strings, separated by
newlines, any of which is to be matched. (-F is specified by
POSIX.)
-x, --line-regexp
Select only those matches that exactly match the whole line.
(-x is specified by POSIX.)
So, better command is:
grep -xvFf file2.csv file1.csv > file3.csv
This command use file2.csv
line as pattern and print line of file1.csv
that doesn't match (-v
).
In order to be able to use comm
, you have to sort the lines first.
comm -23 <(sort file1.csv) <(sort file2.csv) > file3.csv
A python option:
#!/usr/bin/env python3
import sys
def readfile(file):
with open(file) as src:
return [line.strip() for line in src.readlines()]
lines_1 = readfile(sys.argv[1]); lines_2 = readfile(sys.argv[2])
for line in lines_1:
if not line in lines_2:
print(line)
Output:
1,4,5,6
1,11,13,17
Paste the script into an empty file as extract.py
, make it executable and run it by the command:
<script> <file_1> <file_2>
Or, to write it directly to file_3:
<script> <file_1> <file_2> >file_3
Using diff
command do grep
and no storing required.
Output if lines exist in file1 but not in file2:
$ diff file{1,2}.csv | grep -Po "^< \K.*"
1,4,5,6
1,11,13,17
And output if lines exist in file2 but not in file1, with just changing left angle(<
) to right angle(>
):
$ diff file{1,2}.csv | grep -Po "^> \K.*"
2,4,9,10
13,14,17,18