How can I get diff to show only added and deleted lines? If diff can't do it, what tool can?


Solution 1:

Try comm, which compares two sorted files line by line.

Another way to look at the problem:

  • Show lines that only exist in file a: (i.e. what was deleted from a)

      comm -23 a b
    
  • Show lines that only exist in file b: (i.e. what was added to b)

      comm -13 a b
    
  • Show lines that only exist in one file or the other: (but not both)

      comm -3 a b | sed 's/^\t//'
    

(Warning: the sed command strips the first leading TAB from every line, so a line from file a that itself starts with a TAB will lose that TAB in the output.)
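One way to sidestep the TAB issue is to run the two one-sided comparisons separately and label each line yourself. The following is only a sketch: the file contents are made-up sample data, and the "- " and "+ " prefixes are an arbitrary choice, not something comm produces.

```shell
# Hypothetical sample files, already sorted as comm requires.
workdir=$(mktemp -d)
printf '%s\n' common shared unique > "$workdir/a"
printf '%s\n' common individual shared > "$workdir/b"

# Prefix lines only in a with "- " and lines only in b with "+ ",
# instead of relying on comm's TAB-indented columns.
labeled=$(
  comm -23 "$workdir/a" "$workdir/b" | sed 's/^/- /'
  comm -13 "$workdir/a" "$workdir/b" | sed 's/^/+ /'
)
printf '%s\n' "$labeled"
rm -r "$workdir"
```

With this sample data the script prints "- unique" followed by "+ individual". Because nothing is stripped, lines that begin with a TAB keep it.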

Sorted files only

NOTE: Both files need to be sorted for comm to work properly. If they aren't already sorted, you should sort them:

sort <a >a.sorted
sort <b >b.sorted
comm -3 a.sorted b.sorted

If the files are extremely long, this may be quite a burden as it requires an extra copy and therefore twice as much disk space.
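If your shell is bash (an assumption; plain sh does not support this syntax), process substitution lets comm read the sorted data directly, so no a.sorted or b.sorted files are left on disk. A minimal sketch with made-up sample data:

```shell
#!/usr/bin/env bash
# Hypothetical unsorted sample files standing in for a and b.
workdir=$(mktemp -d)
printf '%s\n' pear apple fig > "$workdir/a"
printf '%s\n' fig kiwi apple > "$workdir/b"

# <(...) is bash process substitution: each sort runs on the fly
# and comm reads its output as if it were a file.
only_in_a=$(comm -23 <(sort "$workdir/a") <(sort "$workdir/b"))
only_in_b=$(comm -13 <(sort "$workdir/a") <(sort "$workdir/b"))

echo "deleted: $only_in_a"
echo "added:   $only_in_b"
rm -r "$workdir"
```

Here the deleted line is pear and the added line is kiwi. Note that sort may still use temporary space internally for very large inputs, so this avoids the permanent copies rather than all extra disk use.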

Solution 2:

To show additions and deletions without context, line numbers, or markers such as +, -, <, > and !, you can use (GNU) diff like this:

diff --changed-group-format='%<%>' --unchanged-group-format='' a.txt b.txt 

For example, given two files:

a.txt

Common
Common
A-ONLY
Common

b.txt

Common
B-ONLY
Common
Common

The following command will show lines either removed from a or added to b:

diff --changed-group-format='%<%>' --unchanged-group-format='' a.txt b.txt 

output:

B-ONLY
A-ONLY

This slightly different command will show lines removed from a.txt:

diff --changed-group-format='%<' --unchanged-group-format='' a.txt b.txt 

output:

A-ONLY

Finally, this command will show lines added in b.txt (that is, lines that are not in a.txt):

diff --changed-group-format='%>' --unchanged-group-format='' a.txt b.txt 

output:

B-ONLY
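If you want the bare lines but still need to tell additions from deletions, GNU diff's per-line format options can add a marker of your choosing. This is a sketch that assumes GNU diff; the "- " and "+ " prefixes are arbitrary, and the files are recreated in a temporary directory only to make the example self-contained.

```shell
# Recreate the a.txt and b.txt samples in a scratch directory.
workdir=$(mktemp -d)
printf '%s\n' Common Common A-ONLY Common > "$workdir/a.txt"
printf '%s\n' Common B-ONLY Common Common > "$workdir/b.txt"

# %L is the line including its newline. diff exits with status 1
# when the files differ, so "|| true" keeps "set -e" scripts alive.
marked=$(diff --old-line-format='- %L' \
              --new-line-format='+ %L' \
              --unchanged-line-format='' \
              "$workdir/a.txt" "$workdir/b.txt" || true)
printf '%s\n' "$marked"
rm -r "$workdir"
```

For these files the script prints "+ B-ONLY" and then "- A-ONLY". Unlike comm, diff does not need the inputs to be sorted.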

Solution 3:

comm might do what you want. From its man page:

DESCRIPTION

Compare sorted files FILE1 and FILE2 line by line.

With no options, produce three-column output. Column one contains lines unique to FILE1, column two contains lines unique to FILE2, and column three contains lines common to both files.

These columns are suppressable with -1, -2 and -3 respectively.

Example:

[root@dev ~]# cat a
common
shared
unique

[root@dev ~]# cat b
common
individual
shared

[root@dev ~]# comm -3 a b
    individual
unique

And if you just want the unique lines and don't care which file they're in:

[root@dev ~]# comm -3 a b | sed 's/^\t//'
individual
unique

As the man page says, the files must be sorted beforehand.

Solution 4:

Visual comparison tools fit two files together so that a segment with the same number of lines but differing content will be considered a changed segment. Completely new lines between matching segments are considered added segments.

This is also how the sdiff command-line tool works, which shows a side-by-side comparison of two files in a terminal. Changed lines are separated by the | character. If a line exists only in file A, < is used as the separator. If a line exists only in file B, > is used as the separator. If the files themselves contain no < or > characters, you can use this to show only the added and deleted lines:

sdiff A B | grep '[<>]'
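To see what the filter keeps, here is a small sketch that runs it on made-up sample files. The names A and B and their contents are illustrative only, and the exact column spacing of sdiff's output depends on its width settings.

```shell
# Hypothetical sample files for the side-by-side comparison.
workdir=$(mktemp -d)
printf '%s\n' Common Common A-ONLY Common > "$workdir/A"
printf '%s\n' Common B-ONLY Common Common > "$workdir/B"

# Keep only rows whose separator is "<" (only in A) or ">" (only in B).
changes=$(sdiff "$workdir/A" "$workdir/B" | grep '[<>]')
printf '%s\n' "$changes"
rm -r "$workdir"
```

Two rows survive the filter: one containing B-ONLY with a > separator and one containing A-ONLY with a < separator. The separator characters and padding remain in the output, so further trimming is needed if you want the bare lines.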