Pandas how can 'replace' work after 'loc'?
I have tried many times, but seems the 'replace' can NOT work well after use 'loc'. For example I want to replace the 'conlumn_b' with an regex for the row that the 'conlumn_a' value is 'apple'.
Here is my sample code :
df.loc[df['conlumn_a'] == 'apple', 'conlumn_b'].replace(r'^11*', 'XXX',inplace=True, regex=True)
Example:
conlumn_a conlumn_b
apple 123
banana 11
apple 11
orange 33
The result that I expected for the 'df' is:
conlumn_a conlumn_b
apple 123
banana 11
apple XXX
orange 33
Anyone has meet this issue that needs 'replace' with regex after 'loc' ?
OR you guys has some other good solutions ?
Thank you so much for your help!
Solution 1:
inplace=True
works on the object that it was applied on.
When you call .loc
, you're slicing your dataframe object to return a new one.
>>> id(df)
4587248608
And,
>>> id(df.loc[df['conlumn_a'] == 'apple', 'conlumn_b'])
4767716968
Now, calling an in-place replace
on this new slice will apply the replace operation, updating the new slice itself, and not the original.
Now, note that you're calling replace
on a column of int
, and nothing is going to happen, because regular expressions work on strings.
Here's what I offer you as a workaround. Don't use regex at all.
m = df['conlumn_a'] == 'apple'
df.loc[m, 'conlumn_b'] = df.loc[m, 'conlumn_b'].replace(11, 'XXX')
df
conlumn_a conlumn_b
0 apple 123
1 banana 11
2 apple XXX
3 orange 33
Or, if you need regex based substitution, then -
df.loc[m, 'conlumn_b'] = df.loc[m, 'conlumn_b']\
.astype(str).replace('^11$', 'XXX', regex=True)
Although, this converts your column to an object column.
Solution 2:
I'm going to borrow from a recent answer of mine. This technique is a general purpose strategy for updating a dataframe in place:
df.update(
df.loc[df['conlumn_a'] == 'apple', 'conlumn_b']
.replace(r'^11$', 'XXX', regex=True)
)
df
conlumn_a conlumn_b
0 apple 123
1 banana 11
2 apple XXX
3 orange 33
Note that all I did was remove the inplace=True
and instead wrapped it in the pd.DataFrame.update
method.