Replacing values greater 1 in a large pandas dataframe
Solution 1:
You can try this.
Using boolean mask
df.set_index('Report No',inplace=True)
df[df > 1] = 1
df.reset_index()
Report No Apple Orange Lemon Grape Pear
One 1 0 1 1 1
Two 1 1 0 1 1
Three 0 0 1 1 1
Four 1 1 1 0 0
Five 1 0 0 1 1
Six 1 1 1 1 0
Or use this if you have some non numeric columns. No need to use set_index
and reset_index
. This is equivalent to df.select_dtypes('number')
val = df._get_numeric_data()
val[val > 1] = 1
df
Report No Apple Orange Lemon Grape Pear
One 1 0 1 1 1
Two 1 1 0 1 1
Three 0 0 1 1 1
Four 1 1 1 0 0
Five 1 0 0 1 1
Six 1 1 1 1 0
df.mask
df.set_index('Report No',inplace=True)
df.mask(df>1,1).reset_index()
Report No Apple Orange Lemon Grape Pear
One 1 0 1 1 1
Two 1 1 0 1 1
Three 0 0 1 1 1
Four 1 1 1 0 0
Five 1 0 0 1 1
Six 1 1 1 1 0
np.where
df[df.columns[1:]] = df.iloc[:,1:].where(df.iloc[:,1:] >1 ,1)
np.select
This can be helpful when dealing with multiple conditions. If you want to convert values less than 0 to 0 and values greater than 1 to 1.
df.set_index('Report No', inplace=True)
condlist = [df >= 1, df <= 0] #you can have more conditions and add choices accordingly.
choice = [1, 0] #len(condlist) should be equal to len(choice).
df.loc[:] = np.select(condlist, choice)
Like Jan mentioned use df.clip
Not recommended but you can try this for fun. Using df.astype
.
df.set_index('Report No',inplace=True)
df.astype('bool').astype('int')
NOTE: This will only convert falsy values to False
and truthy values to True
i.e. this will convert 0
to False
and anything other than 0
is True
even negative numbers.
s = pd.Series([1,-1,0])
s.astype('bool')
0 True
1 True
2 False
dtype: bool
s.astype('bool').astype('int')
0 1
1 1
2 0
dtype: int32
np.sign
When values present are between [0, n]
i.e no negative values.
df.loc[:] = np.sign(df)
Solution 2:
Use pandas.DataFrame.clip
:
new_df = df.clip(0, 1)
EDIT: To exclude the first column by name (this will edit the DataFrame in-place)
mask = df.columns != "Report No"
df.loc[:, mask] = df.loc[:, mask].clip(0, 1)