Applying function based on condition on pandas dataframe series
I am new to Pandas
My dataframe:
df
A       B
first   True
second  False
third   False
fourth  True
fifth   False
Desired output
A       B      C
first   True   en
second  False
third   False
fourth  True   en
fifth   False
I am trying to apply a function to column C only when the B column is True.
What I use
if (df['B'] == True):
    df['C'] = df['A'].apply(
        lambda x: TextBlob(x).detect_language())
But I get an error:
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
What I've tried
df['B'].bool()
df['B'] is True
df['B'] == 'True'
But the error persists. I'm not sure how to form a statement that says 'only where column B is True'.
Thank you for your suggestions.
Solution 1:
If you want missing values in the non-matching rows, filter the rows before calling apply, so that only the rows where B is True are processed:
df['C'] = df.loc[df['B'], 'A'].apply(lambda x: TextBlob(x).detect_language())
print (df)
        A      B    C
0   first   True   en
1  second  False  NaN
2   third  False  NaN
3  fourth   True   en
4   fifth  False  NaN
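To see why the original `if` fails and why the boolean mask works, here is a minimal self-contained sketch. The function `fake_detect` is a hypothetical stand-in for `TextBlob(x).detect_language()` so the example runs without TextBlob or a network connection; the DataFrame is rebuilt from the question's data.

```python
import pandas as pd

def fake_detect(text):
    # Hypothetical stand-in for TextBlob(text).detect_language().
    return "en"

df = pd.DataFrame({
    "A": ["first", "second", "third", "fourth", "fifth"],
    "B": [True, False, False, True, False],
})

# df["B"] == True is a whole Series of booleans, so Python cannot
# reduce it to a single True/False for the `if` statement.
try:
    if df["B"] == True:
        pass
    error_message = None
except ValueError as exc:
    error_message = str(exc)

# Use the boolean column as a row mask instead: only rows where B is
# True are passed to the function. On assignment, pandas aligns the
# result on the index, so the remaining rows become NaN.
mask = df["B"]
df["C"] = df.loc[mask, "A"].apply(fake_detect)
print(df)
```

The key point is that the condition selects rows rather than guarding the whole statement, so no Python-level `if` is needed.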
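Or, if you need empty strings for the non-matching rows, use np.where (with numpy imported as np). Note that here the function is applied to every row, not only the rows where B is True: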
df['C'] = np.where(df['B'], df['A'].apply(lambda x: TextBlob(x).detect_language()), '')
print (df)
        A      B   C
0   first   True  en
1  second  False
2   third  False
3  fourth   True  en
4   fifth  False
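If the function is expensive, it may be worth combining both ideas: get empty strings in the non-matching rows while still calling the function only on the True rows. The following is one possible sketch, not part of the original answer; `fake_detect` is again a hypothetical stand-in for `TextBlob(x).detect_language()`, and the `calls` list exists only to show which rows were processed.

```python
import pandas as pd

calls = []

def fake_detect(text):
    # Hypothetical stand-in for TextBlob(text).detect_language();
    # records each input so we can check which rows were processed.
    calls.append(text)
    return "en"

df = pd.DataFrame({
    "A": ["first", "second", "third", "fourth", "fifth"],
    "B": [True, False, False, True, False],
})

# Start with empty strings everywhere, then overwrite only the rows
# selected by the boolean mask.
mask = df["B"]
df["C"] = ""
df.loc[mask, "C"] = df.loc[mask, "A"].apply(fake_detect)
print(df)
```

With this approach the function runs twice (for "first" and "fourth") instead of five times, which matters when each call is slow.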