Change value of a given column if another column contains the name of the given column

I am an intermediate learner and I have a pandas dataframe like below:

import pandas as pd

dfx = pd.DataFrame({'ID': ['ID_1', 'ID_2', 'ID_3', 'ID_4'], 'Extracts': [['QA,QB'], ['QB,QD'], ['QA,QD'], ['QC']], 'QA': [0, 0, 0, 0], 'QB': [0, 0, 0, 0], 'QC': [0, 0, 0, 0], 'QD': [0, 0, 0, 0]})

If any of the text in the 'Extracts' column matches one of the last four column names, I want the corresponding cells to be changed from 0 to 1, as shown in the following tables.

From this:

| ID | Extracts | QA | QB| QC|QD |
|----|:--------:|----|---|---|---|
|ID_1|['QA,QB'] |0   |0  |0  |0  |
|ID_2|['QB,QD'] |0   |0  |0  |0  |
|ID_3|['QA,QD'] |0   |0  |0  |0  |
|ID_4|['QC']    |0   |0  |0  |0  |

To this:

| ID | Extracts | QA | QB| QC|QD |
|----|:--------:|----|---|---|---|
|ID_1|['QA,QB'] |1   |1  |0  |0  |
|ID_2|['QB,QD'] |0   |1  |0  |1  |
|ID_3|['QA,QD'] |1   |0  |0  |1  |
|ID_4|['QC']    |0   |0  |1  |0  |

So far I have tried looping through the columns:

for i in list(dfx.columns[2:6]):
    print(i)
    if dfx.Extracts.str.contains(i).any():
        dfx.i=1

But I cannot get this working. I would appreciate it if someone could guide me through this. Many thanks in advance.


Solution 1:

Each cell in 'Extracts' is a one-element list, so we can index with the str accessor (.str[0]) to select the string inside it, then use str.get_dummies to create a dataframe of indicator variables, and finally update the original dataframe in place with the values from the indicator dataframe:

dfx.update(dfx['Extracts'].str[0].str.get_dummies(sep=','))

print(dfx)

     ID Extracts  QA  QB  QC  QD
0  ID_1  [QA,QB]   1   1   0   0
1  ID_2  [QB,QD]   0   1   0   1
2  ID_3  [QA,QD]   1   0   0   1
3  ID_4     [QC]   0   0   1   0
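
For comparison, here is a minimal sketch of a loop-based variant that stays closer to the attempt in the question. It is not part of the answer above, and the helper name `labels` is introduced here only for illustration. Two things in the original loop stand out: `dfx.i = 1` sets an attribute literally named `i` rather than the column whose name is stored in the variable `i`, and an `.any()` check decides for the whole column at once instead of row by row. The sketch assumes every 'Extracts' cell is a one-element list holding a comma-separated string, as in the sample data.

```python
import pandas as pd

dfx = pd.DataFrame({
    'ID': ['ID_1', 'ID_2', 'ID_3', 'ID_4'],
    'Extracts': [['QA,QB'], ['QB,QD'], ['QA,QD'], ['QC']],
    'QA': [0, 0, 0, 0], 'QB': [0, 0, 0, 0],
    'QC': [0, 0, 0, 0], 'QD': [0, 0, 0, 0],
})

# Pull the string out of each one-element list, then split it into labels,
# e.g. ['QA,QB'] -> 'QA,QB' -> ['QA', 'QB'].
labels = dfx['Extracts'].str[0].str.split(',')

# For each indicator column, mark the rows whose label list contains its name.
# dfx[col] (bracket indexing) addresses the column named by the variable col.
for col in dfx.columns[2:6]:
    dfx[col] = labels.apply(lambda parts: int(col in parts))

print(dfx)
```

Testing membership in the split list, rather than a substring match, means a column name is only matched as a whole label. For larger frames the get_dummies approach above is likely the better fit, since it avoids the Python-level loop.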