Lambda including if...elif...else

I want to apply a lambda function to a DataFrame column using if...elif...else within the lambda function.

The df and the code are something like:

df=pd.DataFrame({"one":[1,2,3,4,5],"two":[6,7,8,9,10]})

df["one"].apply(lambda x: x*10 if x<2 elif x<4 x**2 else x+10)

Obviously, this doesn't work. Is there a way to apply if....elif....else to a lambda? How can I get the same result with List Comprehension?


Solution 1:

Nest if .. elses:

lambda x: x*10 if x<2 else (x**2 if x<4 else x+10)

Solution 2:

I do not recommend the use of apply here: it should be avoided if there are better alternatives.

For example, if you are performing the following operation on a Series:

if cond1:
    exp1
elif cond2:
    exp2
else:
    exp3

This is usually a good use case for np.where or np.select.


numpy.where

The if else chain above can be written using

np.where(cond1, exp1, np.where(cond2, exp2, ...))

np.where allows nesting. With one level of nesting, your problem can be solved with,

df['three'] = (
    np.where(
        df['one'] < 2, 
        df['one'] * 10, 
        np.where(df['one'] < 4, df['one'] ** 2, df['one'] + 10))
df

   one  two  three
0    1    6     10
1    2    7      4
2    3    8      9
3    4    9     14
4    5   10     15

numpy.select

Allows for flexible syntax and is easily extensible. It follows the form,

np.select([cond1, cond2, ...], [exp1, exp2, ...])

Or, in this case,

np.select([cond1, cond2], [exp1, exp2], default=exp3)

df['three'] = (
    np.select(
        condlist=[df['one'] < 2, df['one'] < 4], 
        choicelist=[df['one'] * 10, df['one'] ** 2], 
        default=df['one'] + 10))
df

   one  two  three
0    1    6     10
1    2    7      4
2    3    8      9
3    4    9     14
4    5   10     15

and/or (similar to the if/else)

Similar to if-else, requires the lambda:

df['three'] = df["one"].apply(
    lambda x: (x < 2 and x * 10) or (x < 4 and x ** 2) or x + 10) 

df
   one  two  three
0    1    6     10
1    2    7      4
2    3    8      9
3    4    9     14
4    5   10     15

List Comprehension

Loopy solution that is still faster than apply.

df['three'] = [x*10 if x<2 else (x**2 if x<4 else x+10) for x in df['one']]
# df['three'] = [
#    (x < 2 and x * 10) or (x < 4 and x ** 2) or x + 10) for x in df['one']
# ]
df
   one  two  three
0    1    6     10
1    2    7      4
2    3    8      9
3    4    9     14
4    5   10     15

Solution 3:

For readability I prefer to write a function, especially if you are dealing with many conditions. For the original question:

def parse_values(x):
    if x < 2:
       return x * 10
    elif x < 4:
       return x ** 2
    else:
       return x + 10

df['one'].apply(parse_values)