Merging two Pandas dataframes based column values

I have two dataframes defined following way:

dataframe1 = pd.DataFrame( [["123", "CTR", "ABC", "DEF", "GHI"],
                            ["123", "RIGHT", "ABC", "DEF", "GHI"],
                            ["123", "LEFT", "ABC", "DEF", "GHI"],
                            ["456", "CTR", "JKL", "MNO", "PQR"],
                            ["456", "RIGHT", "JKL", "MNO", "PQR"],
                            ["456", "LEFT", "JKL", "MNO", "PQR"]],
    columns=["ID","LOCATION",
            "Attr1", "Attr2", "Attr3"],
    )

dataframe2 = pd.DataFrame( [["1", "A", "123"],
                            ["1", "B", "123"],
                            ["1", "C", "123"],
                            ["2", "A", "456"],
                            ["2", "B", "456"],
                            ["2", "C", "456"]],
    columns=["ROW","LOCATION","ID"],
    )

I would like to merge these two dataframes into a dataframe based on the ID column and the values of the Location column. In the location column, the A equals to CTR, B equals to RIGHT and C equals to LEFT. The result what I'm looking for would be like this:

    ID ROW LOCATION Attr1 Attr2 Attr3
0  123   1        A   ABC   DEF   GHI
1  123   1        B   ABC   DEF   GHI
2  123   1        C   ABC   DEF   GHI
3  456   2        A   JKL   MNO   PQR
4  456   2        B   JKL   MNO   PQR
5  456   2        C   JKL   MNO   PQR

Using pandas.merge() I can merge the dataframes using one or several columns, but I get a KeyError as the Location column values don't match.

Is pandas.merge() correct function to do this, and how can I define the matching column values using it?


Solution 1:

map and assign the values using a dictionary, then you can perform a simple merge:

d = {'CTR': 'A', 'RIGHT': 'B', 'LEFT': 'C'}

dataframe2.merge(dataframe1.assign(LOCATION=dataframe1['LOCATION'].map(d)),
                 on=['ID', 'LOCATION'])

output:

  ROW LOCATION   ID Attr1 Attr2 Attr3
0   1        A  123   ABC   DEF   GHI
1   1        B  123   ABC   DEF   GHI
2   1        C  123   ABC   DEF   GHI
3   2        A  456   JKL   MNO   PQR
4   2        B  456   JKL   MNO   PQR
5   2        C  456   JKL   MNO   PQR

Solution 2:

just map that to a dictionary of locations , there is no relation between what you want and dataframe2:

locations = { 'CTR':'A' , 'RIGHT' : 'B', 'LEFT' : 'C'}
dataframe1['LOCATION'] = dataframe1['LOCATION'].map(locations)

print(dataframe1)