Pandas apply but only for rows where a condition is met
The other answers are excellent, but I thought I'd add one other approach that can be faster in some circumstances – using broadcasting and masking to achieve the same result:
import numpy as np
mask = (z['b'] != 0)
z_valid = z[mask]
z['c'] = 0.0  # float default, so the float results assigned below match the column dtype
z.loc[mask, 'c'] = z_valid['a'] / np.log(z_valid['b'])
Especially with very large dataframes, this approach will generally be faster than solutions based on apply().
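For reference, here is a minimal self-contained sketch of the masking approach. The sample frame is an assumption, reconstructed from the output shown in another answer on this page, and the mask is widened (my choice, not part of the snippet above) to also skip b == 1, since log(1) is 0 and the division would otherwise give inf.

```python
import numpy as np
import pandas as pd

# Sample data (assumed, taken from the output table in another answer).
z = pd.DataFrame({"a": [4, 5, 6, 7, 8], "b": [6, 0, 5, 0, 1]})

# Keep only rows where log(b) is defined and nonzero: b > 0 and b != 1.
mask = (z["b"] > 0) & (z["b"] != 1)

# Start with a float default, then overwrite only the masked rows.
z["c"] = 0.0
z.loc[mask, "c"] = z.loc[mask, "a"] / np.log(z.loc[mask, "b"])

print(z)
```

The division runs once over whole columns instead of once per row, which is where the speedup over apply() comes from.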
You can just use a conditional expression (an inline if/else) in a lambda function.
import math
z['c'] = z.apply(lambda row: 0 if row['b'] in (0, 1) else row['a'] / math.log(row['b']), axis=1)
I also excluded 1, because log(1) is zero, so dividing by it would be a division by zero.
Output:
   a  b         c
0  4  6  2.232443
1  5  0  0.000000
2  6  5  3.728010
3  7  0  0.000000
4  8  1  0.000000
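To see why the b == 1 case needs its own guard, here is a small sketch using plain Python numbers. Inside a DataFrame the values are usually NumPy scalars, which may give inf with a warning rather than raising, so treat this only as an illustration of the underlying arithmetic.

```python
import math

# log(1) is exactly zero, so it cannot be used as a divisor.
log_of_one = math.log(1)
print(log_of_one)

# With plain Python numbers the division raises instead of returning a value.
raised = False
try:
    8 / log_of_one
except ZeroDivisionError:
    raised = True

print(raised)
```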
Hope this helps. It is easy and readable:
df['c'] = df['b'].apply(lambda x: 0 if x == 0 else math.log(x))
You can use a lambda with a conditional to return 0 if the input value is 0 and skip the whole where clause:
z['c'] = z.apply(lambda x: math.log(x.b) if x.b > 0 else 0, axis=1)
You also have to assign the results to a new column (z['c']).
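A runnable version of the row-wise lambda, as a sketch: the sample frame is an assumption (the same values as the output table in another answer), and like the one-liner above it stores log(b) itself rather than a / log(b).

```python
import math
import pandas as pd

# Sample data (assumed for illustration).
z = pd.DataFrame({"a": [4, 5, 6, 7, 8], "b": [6, 0, 5, 0, 1]})

# Row-wise apply: take log(b) only where b is positive, otherwise fall back to 0.
z["c"] = z.apply(lambda row: math.log(row["b"]) if row["b"] > 0 else 0, axis=1)

print(z)
```

Using x.b > 0 instead of x.b == 0 also protects against negative values, for which math.log would raise a ValueError.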