How to join two CSV files?

Suppose you have one CSV file with 2 fields: ID and email. You have another file with 2 fields: email and name. How can you produce a file with all three fields joined on email?


Revision3:

You must sort both lists on email alphabetically, then join. Given that the email field the 2nd field of file1 and the 1st field of file2:

sort -t , -k 2,2 file1.csv > sort1.csv
sort -t , -k 1,1 file2.csv > sort2.csv
join -t , -1 2 -2 1 sort1.csv sort2.csv > sort3.csv

parameter meaning

-t ,   : ',' is the field separator
-k 2,2 : character sort on 2nd field
-k 1,1 : character sort on 1st field
-1 2   : file 1, 2nd field
-2 1   : file 2, 1st field
>      : output to file

produces

email,ID,name
email,ID,name
...

sorted by email alphabetically.

Note that if any email is missing from either file it will be omitted from the results.


Use csvkit:

csvjoin -c email id_email.csv email_name.csv

or

csvjoin -c 2,1 id_email.csv email_name.csv