Create a boolean column in pandas datafame based on percentile values of another column

I have a dataframe with multiple columns. I want to create boolean column, flagging if the value belongs to 90th percentile and above.

My data frame also contains multiple zeros.

Example:

Name  Value
Val1  1000
Val2  910
Val3  800
Val4  700
Val5  600
Val6  500
Val7  400
Val8  300
Val9  200
Val10 100
Val11 0

Expected output

Name  Value 90thper
Val1  1000    1
Val2  910     1
Val3  800     0
Val4  700     0
Val5  600     0
Val6  500     0
Val7  400     0
Val8  300     0
Val9  200     0
Val10 100     0
Val11 0       0

Solution 1:

You could use pd.Series.quantile to find the 90th percentile value and include all values above it.

val = df['Value'].quantile(.9, interpolation="lower") # val -> 910
df['90thper'] = df['Value'].ge(val).astype(int)

#     Name  Value  90thper
# 0   Val1   1000        1
# 1   Val2    910        1
# 2   Val3    800        0
# 3   Val4    700        0
# 4   Val5    600        0
# 5   Val6    500        0
# 6   Val7    400        0
# 7   Val8    300        0
# 8   Val9    200        0
# 9  Val10    100        0

Solution 2:

Try quantile

df['new'] = df.Value.ge(df.Value.quantile(0.9)).astype(int)
df
     Name  Value  new
0    Val1   1000    1
1    Val2    910    1
2    Val3    800    0
3    Val4    700    0
4    Val5    600    0
5    Val6    500    0
6    Val7    400    0
7    Val8    300    0
8    Val9    200    0
9   Val10    100    0
10  Val11      0    0