How to compare different columns from two DataFrames in Python

Solution 1:

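You could extract the information into native Python data structures, look up the relationships there, and then map the result back onto your original DataFrame.

To do this, I would first build every (sender, receiver) pair from the Sender and Receiver columns of df2. This assumes the cells are strings such as "[A900,A200]" rather than Python lists: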

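def make_pairs(row):
    # Strip the brackets from each string and split it into individual IDs
    senders = row['Sender'].replace("[", "").replace("]", "").split(",")
    receivers = row['Receiver'].replace("[", "").replace("]", "").split(",")
    # Pair every sender with every receiver in the row
    return [(s, r) for s in senders for r in receivers]

# Result: {row index: [(sender, receiver), ...]}
send_receive_combinations = df2.apply(make_pairs, axis=1).to_dict()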
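
To check what make_pairs produces, here is a minimal sketch that runs it on a single row. A plain dict stands in for the df2 row (an assumption for illustration; the function only needs key lookup). This variant uses strip("[]") and trims whitespace instead of chaining replace calls, so a cell like "[A900, A200]" with a space after the comma still yields clean IDs:

```python
def make_pairs(row):
    # Drop the surrounding brackets, split on commas, trim stray spaces.
    senders = [s.strip() for s in row["Sender"].strip("[]").split(",")]
    receivers = [r.strip() for r in row["Receiver"].strip("[]").split(",")]
    # Pair every sender with every receiver.
    return [(s, r) for s in senders for r in receivers]

# Hypothetical row: a dict standing in for one row of df2.
row = {"Sender": "[A900, A200]", "Receiver": "[A500,A220]"}
pairs = make_pairs(row)
print(pairs)
# [('A900', 'A500'), ('A900', 'A220'), ('A200', 'A500'), ('A200', 'A220')]
```

Whether the trimming is needed depends on how your data was written out; without it, " A200" (with a leading space) would never match an ID in df1.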

Then build a dictionary from df1 that maps each (IDA, IDB) pair to its relationship. The unpacking below assumes df1 has exactly three columns, in that order:

rels = {(ida, idb): rel for ida, idb, rel in df1.values}
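
If df1 carries more than three columns, unpacking df1.values raises a ValueError. A possible alternative, sketched below, selects the columns by name. The IDA and IDB names come from the question; the "Relationship" column name, the extra "Since" column, and the sample rows are assumptions made for illustration:

```python
import pandas as pd

# Hypothetical df1 with one extra column that three-way unpacking
# of df1.values could not handle.
df1 = pd.DataFrame({
    "IDA": ["A900", "A700"],
    "IDB": ["A500", "A111"],
    "Relationship": ["Spouse", "Parent"],
    "Since": [2001, 1995],
})

# Build {(IDA, IDB): relationship} from named columns only.
rels = dict(zip(zip(df1["IDA"], df1["IDB"]), df1["Relationship"]))
print(rels)
# {('A900', 'A500'): 'Spouse', ('A700', 'A111'): 'Parent'}
```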

A dict comprehension (or a simple for loop) can then keep only the pairs that appear in df1, keyed by the df2 row index:

rel_pairs = {key: rels[pair] for key, combination in send_receive_combinations.items() for pair in combination if pair in rels}
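
One thing to be aware of: because the row index is the dictionary key, a row with several matching pairs ends up with only the last match. If you need every match, a loop that collects them into lists is one option. The sketch below uses made-up inputs shaped like the dictionaries above (the "Sibling" entry is hypothetical):

```python
from collections import defaultdict

# Hypothetical inputs in the same shape as the dictionaries built earlier.
send_receive_combinations = {
    0: [("A900", "A500"), ("A900", "A220"), ("A200", "A500")],
    1: [("A150", "A400")],
}
rels = {("A900", "A500"): "Spouse", ("A200", "A500"): "Sibling"}

# The dict comprehension overwrites earlier matches for the same row.
last_match = {key: rels[pair]
              for key, combination in send_receive_combinations.items()
              for pair in combination if pair in rels}

# Collecting into lists keeps every match for a row.
all_matches = defaultdict(list)
for key, combination in send_receive_combinations.items():
    for pair in combination:
        if pair in rels:
            all_matches[key].append(rels[pair])

print(last_match)         # {0: 'Sibling'}
print(dict(all_matches))  # {0: ['Spouse', 'Sibling']}
```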

Finally, map this dict onto df2 by row index:

df2['relationship'] = df2.index
df2['relationship'] = df2['relationship'].map(rel_pairs)
#        Sender     Receiver relationship
#0  [A900,A200]  [A500,A220]       Spouse
#1  [A150,A100]       [A400]          NaN
#2  [A400,A112]       [A500]          NaN
#3  [A700,A112]  [A111,A001]       Parent
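
For anyone who wants to run the whole thing, here is a self-contained sketch of the steps above. The df2 values are copied from the output shown; the df1 rows and the "Relationship" column name are assumptions chosen so that the result matches that output, since the original df1 is not shown here:

```python
import pandas as pd

# Assumed df1: rows picked to reproduce the output above.
df1 = pd.DataFrame({
    "IDA": ["A900", "A700", "A300"],
    "IDB": ["A500", "A111", "A400"],
    "Relationship": ["Spouse", "Parent", "Friend"],
})
# df2 as displayed in the answer, with list-like strings in each cell.
df2 = pd.DataFrame({
    "Sender": ["[A900,A200]", "[A150,A100]", "[A400,A112]", "[A700,A112]"],
    "Receiver": ["[A500,A220]", "[A400]", "[A500]", "[A111,A001]"],
})

def make_pairs(row):
    senders = row["Sender"].strip("[]").split(",")
    receivers = row["Receiver"].strip("[]").split(",")
    return [(s, r) for s in senders for r in receivers]

# Step 1: every (sender, receiver) pair per df2 row.
combos = df2.apply(make_pairs, axis=1).to_dict()
# Step 2: lookup table from df1.
rels = {(ida, idb): rel for ida, idb, rel in df1.values}
# Step 3: keep the pairs that exist in df1, keyed by row index.
rel_pairs = {key: rels[pair]
             for key, combination in combos.items()
             for pair in combination if pair in rels}
# Step 4: map the matches back onto df2 by index.
df2["relationship"] = df2.index
df2["relationship"] = df2["relationship"].map(rel_pairs)
print(df2)
```

Rows 1 and 2 get NaN because none of their pairs appear in this assumed df1.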