How can I extract records with a column in common?
I have two tables with different numbers of columns and rows. I want to find rows using a common column (column B is in common). Here is an example. Could you please help?
file1.txt
A B C D
a b c d
i ii iii iV
* ** # ##
file2.txt
E B
f ff
h b
g gg
k ii
output:
A B C D E
a b c d h
i ii iii iV k
Solution 1:
You can do this kind of thing by building a hash / associative array / lookup table e.g. using Awk:
$ awk 'NR==FNR{B[$2]=$1; next} $2 in B {print $0,B[$2]}' file2.txt file1.txt
A B C D E
a b c d h
i ii iii iV k
There's also the join
command - but that requires inputs to be sorted on the common field.
Solution 2:
To add to the answer by @steeldriver, to do it with sort
and join
:
join -j 2 -o 1.1,0,1.3,1.4,2.1 <(sort -k 2 file1.txt) <(sort -k 2 file2.txt)
-
-j 2
tellsjoin
which field is the key. -
-o
gives the order of the fields in the output, where0
is the common key, and the others areFILENUM.FIELD
. Seeman join
for details. -
-k 2
tellssort
which field is the key. -
<( )
isbash
process substitution.
Output is:
a b c d h
A B C D E
i ii iii iV k