How do I create a new dataframe column based on two other columns?
Solution 1:
pandas uses the bitwise operators (& and |) to combine conditions, and each condition must be wrapped in parentheses; otherwise an error is raised.
Try wrapping each condition in (), like (df['cat_1'] >= 5) & (...), to see if the error goes away.
However, your operation can be simplified with the between function:
df['[5-10]'] = (df.cat_1.between(5, 10) & df.cat_2.between(5, 10)).astype(int)
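To check the between approach on a small frame, something like the sketch below should work. The sample values are made up for illustration; only the column names cat_1, cat_2 and [5-10] come from the question.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({"cat_1": [3, 5, 7, 10, 12],
                   "cat_2": [6, 5, 11, 10, 8]})

# between() includes both endpoints by default, so it matches >= 5 and <= 10.
in_range = df["cat_1"].between(5, 10) & df["cat_2"].between(5, 10)

# Convert the boolean mask to 0/1 for the new column.
df["[5-10]"] = in_range.astype(int)

print(df["[5-10]"].tolist())  # [0, 1, 0, 1, 0]
```

Only the rows where both columns fall inside 5..10 get a 1.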
Solution 2:
The reason you're getting an error is that & has higher precedence than >=, so without parentheses the & is applied before the comparisons. To fix your snippet, add parentheses around the column comparisons:
df.loc[((df['cat_1'] >= 5) & (df['cat_1'] <= 10)
& (df['cat_2'] >= 5) & (df['cat_2'] <= 10)), '[5-10]'] = 1
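If it helps to see the precedence problem in isolation, here is a minimal sketch. The sample data is hypothetical, and the unparenthesized line is only a guess at the kind of expression that triggers the error, not necessarily your exact snippet.

```python
import pandas as pd

# Hypothetical sample data for illustration.
df = pd.DataFrame({"cat_1": [3, 5, 7], "cat_2": [6, 5, 11]})

error = None
try:
    # Without parentheses this parses as
    #   df["cat_1"] >= (5 & df["cat_1"]) <= 10
    # which is a chained comparison, so Python needs a single True/False
    # from a whole Series and pandas refuses to provide one.
    mask = df["cat_1"] >= 5 & df["cat_1"] <= 10
except ValueError as exc:
    error = exc

print(type(error).__name__)  # ValueError

# With parentheses each comparison is evaluated first, then combined by &.
mask = (df["cat_1"] >= 5) & (df["cat_1"] <= 10)
print(mask.tolist())  # [False, True, True]
```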
Even better, define the new column as a whole, without subsetting via .loc. Consider, e.g.:
df['[5-10]'] = df['cat_1'].between(5, 10) & df['cat_2'].between(5, 10)
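One practical reason to prefer the whole-column assignment: when the column does not exist yet, assigning through .loc fills only the matching rows and leaves the rest as NaN. A rough comparison, using made-up sample data and the hypothetical names via_loc and whole:

```python
import pandas as pd

# Hypothetical sample data for illustration.
df = pd.DataFrame({"cat_1": [3, 5, 7], "cat_2": [6, 5, 11]})
mask = df["cat_1"].between(5, 10) & df["cat_2"].between(5, 10)

# .loc on a column that does not exist yet: non-matching rows become NaN.
via_loc = df.copy()
via_loc.loc[mask, "[5-10]"] = 1

# Whole-column assignment: every row gets a definite True/False.
whole = df.copy()
whole["[5-10]"] = mask

print(via_loc["[5-10]"].isna().tolist())  # [True, False, True]
print(whole["[5-10]"].tolist())           # [False, True, False]
```

Append .astype(int) to the mask if you want 0/1 instead of booleans.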