Merging two Pandas dataframes based column values
I have two dataframes defined following way:
dataframe1 = pd.DataFrame( [["123", "CTR", "ABC", "DEF", "GHI"],
["123", "RIGHT", "ABC", "DEF", "GHI"],
["123", "LEFT", "ABC", "DEF", "GHI"],
["456", "CTR", "JKL", "MNO", "PQR"],
["456", "RIGHT", "JKL", "MNO", "PQR"],
["456", "LEFT", "JKL", "MNO", "PQR"]],
columns=["ID","LOCATION",
"Attr1", "Attr2", "Attr3"],
)
dataframe2 = pd.DataFrame( [["1", "A", "123"],
["1", "B", "123"],
["1", "C", "123"],
["2", "A", "456"],
["2", "B", "456"],
["2", "C", "456"]],
columns=["ROW","LOCATION","ID"],
)
I would like to merge these two dataframes into a dataframe based on the ID column and the values of the Location column. In the location column, the A equals to CTR, B equals to RIGHT and C equals to LEFT. The result what I'm looking for would be like this:
ID ROW LOCATION Attr1 Attr2 Attr3
0 123 1 A ABC DEF GHI
1 123 1 B ABC DEF GHI
2 123 1 C ABC DEF GHI
3 456 2 A JKL MNO PQR
4 456 2 B JKL MNO PQR
5 456 2 C JKL MNO PQR
Using pandas.merge() I can merge the dataframes using one or several columns, but I get a KeyError as the Location column values don't match.
Is pandas.merge() correct function to do this, and how can I define the matching column values using it?
Solution 1:
map
and assign
the values using a dictionary, then you can perform a simple merge
:
d = {'CTR': 'A', 'RIGHT': 'B', 'LEFT': 'C'}
dataframe2.merge(dataframe1.assign(LOCATION=dataframe1['LOCATION'].map(d)),
on=['ID', 'LOCATION'])
output:
ROW LOCATION ID Attr1 Attr2 Attr3
0 1 A 123 ABC DEF GHI
1 1 B 123 ABC DEF GHI
2 1 C 123 ABC DEF GHI
3 2 A 456 JKL MNO PQR
4 2 B 456 JKL MNO PQR
5 2 C 456 JKL MNO PQR
Solution 2:
just map that to a dictionary of locations , there is no relation between what you want and dataframe2:
locations = { 'CTR':'A' , 'RIGHT' : 'B', 'LEFT' : 'C'}
dataframe1['LOCATION'] = dataframe1['LOCATION'].map(locations)
print(dataframe1)