How to join files on the command line without creating temp files?
I have two files in a Linux / Bash environment:
# Example data
$ cat abc
2 a
1 b
3 c
$ cat bcd
5 c
2 b
1 d
I'm trying to join the two files on the first column. The following does not work because the input files must be sorted on the match field.
# Wrong: join on unsorted input does not work
$ join abc bcd
I can get around this by creating two temp files and joining them:
$ sort abc > temp1
$ sort bcd > temp2
$ join temp1 temp2
1 b d
2 a b
But is there a way to do this without creating temp files?
Solution 1:
The following will work in the bash shell:
# Join two files
$ join <(sort abc) <(sort bcd)
1 b d
2 a b
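To see why this needs no temp files of your own, it can help to print what bash actually hands to join. This is a minimal sketch, assuming a Linux system where bash implements process substitution through /dev/fd (where /dev/fd is unavailable, bash falls back to a named pipe and the printed path will look different). procsub_path is a hypothetical helper name used only for this illustration.

```bash
#!/usr/bin/env bash
# procsub_path is a hypothetical helper: it prints the file name that bash
# substitutes for <(...). On Linux this is typically something like /dev/fd/63.
procsub_path() {
    echo <(true)
}

procsub_path
```

join opens that name like any other file argument and reads the output of sort through the pipe behind it.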
You can join on any column, as long as you sort both input files on that column:
# Join on the second field
$ join -j2 <(sort -k2 abc) <(sort -k2 bcd)
b 1 2
c 3 5
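The -k2 argument tells sort to sort on the second column (strictly, -k2 uses everything from the second field to the end of the line as the key; -k2,2 restricts the key to the second field alone). The -j2 argument tells join to use the second column of both files as the join field. Alternatively, join -1 x -2 y file1 file2 joins on the xth column of file1 and the yth column of file2.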
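The -1/-2 form matters when the join field sits in a different column in each input. Here is a small sketch of that case. The abc data is taken from the question, but the names data and the function names (abc, names, join_mixed) are made up for illustration, and the inputs are produced by shell functions so the example is self-contained.

```bash
#!/usr/bin/env bash
# Sample inputs as functions. abc mirrors the file from the question;
# names is a hypothetical lookup table keyed on its first column.
abc()   { printf '%s\n' '2 a' '1 b' '3 c'; }
names() { printf '%s\n' 'c carol' 'b bob'; }

# Join column 2 of abc against column 1 of names.
# Each input is sorted on its own join field before join reads it.
join_mixed() {
    join -1 2 -2 1 <(abc | sort -k 2,2) <(names | sort -k 1,1)
}

join_mixed
# Prints:
# b 1 bob
# c 3 carol
```

The join field is printed first, followed by the remaining fields of the first input and then those of the second.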
Solution 2:
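Zsh answer (note that =(...) makes zsh write the output to a temp file and delete it once the command finishes, so the temp files still exist but zsh manages them for you; zsh also accepts the <(...) form):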
join =(sort abc) =(sort bcd)
Solution 3:
This will work in the bash shell:
# Join two files
$ sort abc | join - <(sort bcd)
1 b d
2 a b
OR
# Join two files
$ sort bcd | join <(sort abc) -
1 b d
2 a b
This works because join reads standard input when '-' is given in place of a file name.
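The '-' form is handy when one input is already the tail end of a longer pipeline. A rough sketch, using the data from the question supplied by shell functions instead of files; the grep filter and the function names (abc, bcd, join_filtered) are assumptions added purely for illustration.

```bash
#!/usr/bin/env bash
# Sample inputs as functions, mirroring the files abc and bcd from the question.
abc() { printf '%s\n' '2 a' '1 b' '3 c'; }
bcd() { printf '%s\n' '5 c' '2 b' '1 d'; }

# Drop the rows of abc whose second column is "a", sort what is left,
# and feed it to join on standard input ('-'). The second input still
# comes from a process substitution.
join_filtered() {
    abc | grep -v ' a$' | sort | join - <(bcd | sort)
}

join_filtered
# Prints:
# 1 b d
```

Only one of the two inputs can be '-', since there is a single standard input; the other one has to be a file name or a process substitution.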