Not able to apply lambda function to a dataset
The dataset contains a column Pclass with values (1, 2, 3) and a column Age. The Age column has some null values. I want to replace those null values with the median age of the people in each class. The median age is 37 in 1st class, 29 in 2nd class and 24 in 3rd class.
So here is the code of what I am trying to do:
def fill_age(x):
    if pd.isna(x['Age']) and x['Pclass'] == 1:
        return 37
    elif pd.isna(x['Age']) and x['Pclass'] == 2:
        return 29
    elif pd.isna(x['Age']) and x['Pclass'] == 3:
        return 24
    else:
        return x['Age']

df['Age'] = df.apply(fill_age)
But this is the error I am getting:
KeyError Traceback (most recent call last)
<ipython-input-126-7375a6b3c119> in <module>
----> 1 df['Age'] = df.apply(fill_age)
KeyError: 'Age'
Please let me know what I am doing wrong. Thank you in advance.
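Use DataFrame.apply with axis=1 so that fill_age receives one row at a time. With the default axis=0 it receives each column instead, and a column has no 'Age' label, hence the KeyError: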
df['Age'] = df.apply(fill_age, axis=1)
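To make the difference between the two axes visible, here is a minimal sketch on made-up data. The sample values and the helper index_labels are hypothetical and only illustrate which labels the applied function gets to see.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, not the asker's actual dataset
df = pd.DataFrame({
    "Pclass": [1, 2, 3, 1],
    "Age": [np.nan, np.nan, np.nan, 40.0],
})

def index_labels(x):
    # Report the labels available on the Series that apply() passes in
    return ",".join(map(str, x.index))

# Default axis=0: the function is called once per column,
# so each Series is indexed by the row labels 0..3
by_column = df.apply(index_labels)

# axis=1: the function is called once per row,
# so each Series is indexed by the column names
by_row = df.apply(index_labels, axis=1)

print(by_column["Age"])  # 0,1,2,3
print(by_row[0])         # Pclass,Age
```

Only in the axis=1 case is 'Age' among the labels, which is why x['Age'] works there and fails under the default.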
For a vectorized (faster) alternative, use Series.fillna and build the fill values from Pclass with Series.map and a dictionary:
df['Age'] = df['Age'].fillna(df['Pclass'].map({1:37,2:29,3:24}))
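As a quick check of the vectorized version, the following sketch runs it on a small made-up frame. The sample rows and the names medians and filled are assumptions for illustration; the median values are the ones quoted in the question.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, not the asker's actual dataset
df = pd.DataFrame({
    "Pclass": [1, 2, 3, 1, 3],
    "Age": [np.nan, np.nan, np.nan, 40.0, 22.0],
})

# Medians quoted in the question, keyed by passenger class
medians = {1: 37, 2: 29, 3: 24}

# map() builds a Series of per-row fill values aligned on df's index;
# fillna() takes a value from it only where Age is missing
filled = df["Age"].fillna(df["Pclass"].map(medians))

print(filled.tolist())  # [37.0, 29.0, 24.0, 40.0, 22.0]
```

Existing ages are left untouched, and each missing age takes the value for that row's class. If the medians should come from the data rather than being hard-coded, df.groupby('Pclass')['Age'].transform('median') could probably stand in for the mapped Series, though that is a variation and not part of the answer above.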