Inner join on two text files
I'm looking to perform an inner join on two different text files. Basically, I'm looking for the inner-join equivalent of the GNU join program. Does such a thing exist? If not, an awk or sed solution would be most helpful, but my first choice would be a Linux command.
Here's an example of what I'm looking to do:
file 1:
0|Alien Registration Card LUA|Checklist Update
1|Alien Registration Card LUA|Document App Plan
2|Alien Registration Card LUA|SA Application Nbr
3|Alien Registration Card LUA|tmp_preapp-DOB
0|App - CSCE Certificate LUA|Admit Type
1|App - CSCE Certificate LUA|Alias 1
2|App - CSCE Certificate LUA|Alias 2
3|App - CSCE Certificate LUA|Alias 3
4|App - CSCE Certificate LUA|Alias 4
file 2:
Alien Registration Card LUA
Results:
0|Alien Registration Card LUA|Checklist Update
1|Alien Registration Card LUA|Document App Plan
2|Alien Registration Card LUA|SA Application Nbr
3|Alien Registration Card LUA|tmp_preapp-DOB
Solution 1:
Here's an awk option, so you can avoid the bash dependency (for portability):
$ awk -F'|' 'NR==FNR{check[$0];next} $2 in check' file2 file1
How does this work?
- -F'|' -- sets the field separator to |.
- NR==FNR{check[$0];next} -- if the total record number equals the per-file record number (i.e. we're reading the first file provided), store the whole line as a key in the array and skip to the next record.
- $2 in check -- if the second field is a key in the array we created, print the line (printing is the default action when no action is provided).
- file2 file1 -- the input files. Order is important because of the NR==FNR construct.
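To see the two passes in action, here is a minimal sketch that recreates a trimmed-down copy of the sample data in a temporary directory and runs the same awk command. The `tmpdir` and `result` names are only for illustration and are not part of the original answer.

```shell
# Throwaway directory so nothing in the current directory is touched.
tmpdir=$(mktemp -d)

# Trimmed-down copy of file 1 from the question.
cat > "$tmpdir/file1" <<'EOF'
0|Alien Registration Card LUA|Checklist Update
1|Alien Registration Card LUA|Document App Plan
0|App - CSCE Certificate LUA|Admit Type
1|App - CSCE Certificate LUA|Alias 1
EOF

# file 2 holds the keys to keep, one per line.
cat > "$tmpdir/file2" <<'EOF'
Alien Registration Card LUA
EOF

# Pass 1 (file2): remember every line as a key.
# Pass 2 (file1): print lines whose second field is a remembered key.
result=$(awk -F'|' 'NR==FNR { check[$0]; next } $2 in check' \
    "$tmpdir/file2" "$tmpdir/file1")
printf '%s\n' "$result"

rm -rf "$tmpdir"
```

Only the two "Alien Registration Card LUA" lines should be printed, in the same order as they appear in file1.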
Solution 2:
Shouldn't file2 contain LUA at the end?
If so, you can still use join (both inputs must be sorted on the join field):
join -t'|' -1 2 <(sort -t'|' -k2,2 file1) file2
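One thing to be aware of: join prints the join field first, so the columns come out in a different order than in file1. The following is a sketch of one way to keep the original column order using the `-o` output-format option. It uses temporary sorted files instead of process substitution so that it also runs in plain sh, and it pins `LC_ALL=C` so sort and join agree on collation. The `tmpdir` and `joined` names and the trimmed sample data are assumptions for illustration.

```shell
# Use the same collation for sort and join.
LC_ALL=C
export LC_ALL

tmpdir=$(mktemp -d)

cat > "$tmpdir/file1" <<'EOF'
0|Alien Registration Card LUA|Checklist Update
1|Alien Registration Card LUA|Document App Plan
0|App - CSCE Certificate LUA|Admit Type
EOF

cat > "$tmpdir/file2" <<'EOF'
Alien Registration Card LUA
EOF

# join requires both inputs to be sorted on their join fields.
sort -t'|' -k2,2 "$tmpdir/file1" > "$tmpdir/file1.sorted"
sort "$tmpdir/file2" > "$tmpdir/file2.sorted"

# -1 2 / -2 1: join field 2 of file1 against field 1 of file2.
# -o 1.1,1.2,1.3: print file1's three fields in their original order.
joined=$(join -t'|' -1 2 -2 1 -o 1.1,1.2,1.3 \
    "$tmpdir/file1.sorted" "$tmpdir/file2.sorted")
printf '%s\n' "$joined"

rm -rf "$tmpdir"
```

Unlike the awk approach, the output follows the sorted order of the join field rather than the original line order of file1.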
Solution 3:
Looks like you just need
grep -F -f file2 file1
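A caveat worth checking against your data: grep -F -f matches each key anywhere on the line, not only in the second field. The sketch below adds a hypothetical line (not in the original sample) where the key appears in the third field, and compares the grep result with the field-aware awk command from Solution 1.

```shell
tmpdir=$(mktemp -d)

# The second line is a made-up example: the key shows up in field 3, not field 2.
cat > "$tmpdir/file1" <<'EOF'
0|Alien Registration Card LUA|Checklist Update
1|Some Other Document|See Alien Registration Card LUA
EOF

cat > "$tmpdir/file2" <<'EOF'
Alien Registration Card LUA
EOF

# grep treats each line of file2 as a fixed string and matches it anywhere.
grep_count=$(grep -c -F -f "$tmpdir/file2" "$tmpdir/file1")

# awk only compares the key against the second field.
awk_out=$(awk -F'|' 'NR==FNR { check[$0]; next } $2 in check' \
    "$tmpdir/file2" "$tmpdir/file1")

echo "grep matched $grep_count line(s)"
printf 'awk kept:\n%s\n' "$awk_out"

rm -rf "$tmpdir"
```

If the key text can never occur outside the second field, as in the sample data from the question, the grep one-liner gives the same lines as the awk version.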