How to sort a DataFrame into bins, keeping the names for each bin?
I've got a simple DataFrame generated by the following code:
import pandas as pd
df = pd.DataFrame([[0.45, 0.34],
                   [0.51, 0.55],
                   [0.62, 0.48],
                   [0.71, 0.65],
                   [0.68, 0.79]],
                  columns=[0, 1],
                  index=list("ABCDE"))
print(df.to_string())
I'd like to turn it into 100 bins like this: 0 to 0.01, 0.01 to 0.02, ..., 0.99 to 1 for the 1st column and the same for the 2nd column (including the number on the left, but excluding the number on the right). In addition, I'd like each bin to contain the names of the rows that fit into it. How can I do it using pandas or numpy?
IIUC you can simply use pd.cut:
import numpy as np
df.apply(pd.cut, bins=np.linspace(0, 1, 101))
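Output (pd.cut closes each bin on the right by default; pass right=False if the bins should be closed on the left instead):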
              0             1
A  (0.44, 0.45]  (0.33, 0.34]
B   (0.5, 0.51]  (0.54, 0.55]
C  (0.61, 0.62]  (0.47, 0.48]
D   (0.7, 0.71]  (0.64, 0.65]
E  (0.67, 0.68]  (0.78, 0.79]
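The question also asks for bins that include the left edge and for the row names inside each bin. Here is a rough sketch of one way to get both. It is only an illustration: building the edges as np.arange(101) / 100 and the name names_per_bin are my own assumptions, not part of the answer above.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([[0.45, 0.34],
                   [0.51, 0.55],
                   [0.62, 0.48],
                   [0.71, 0.65],
                   [0.68, 0.79]],
                  columns=[0, 1],
                  index=list("ABCDE"))

# Edges 0.00, 0.01, ..., 1.00. Dividing integers by 100 is an assumption made
# here so that each edge equals the matching float literal (45 / 100 == 0.45).
edges = np.arange(101) / 100

# right=False closes each bin on the left, so 0.45 lands in [0.45, 0.46).
binned = df.apply(pd.cut, bins=edges, right=False)

# For each column, collect the row names that fall into each non-empty bin.
names_per_bin = {
    col: df.index.to_series().groupby(binned[col], observed=True).agg(list)
    for col in df.columns
}
print(names_per_bin[0])
```

With observed=True only the bins that actually contain a row are listed; drop it if the empty bins should show up as well.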
Kinda the same as the previous answer, but flattening both columns into a single Series first:
pd.cut(df.unstack(), bins=100)
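One thing to watch with an integer bins: pd.cut then splits the observed range of the data (here roughly 0.34 to 0.79) into 100 equal-width bins, not the range 0 to 1. The sketch below contrasts that with explicit edges and then gathers the row names per bin from the flattened Series. The names auto, fixed, long and names are only illustrative, and the edges are an assumption about what the question wants.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([[0.45, 0.34],
                   [0.51, 0.55],
                   [0.62, 0.48],
                   [0.71, 0.65],
                   [0.68, 0.79]],
                  columns=[0, 1],
                  index=list("ABCDE"))

flat = df.unstack()  # Series indexed by (column, row name)

# Integer bins: 100 equal-width bins over the observed range of the data.
auto = pd.cut(flat, bins=100)
print(auto.cat.categories[0], auto.cat.categories[-1])

# Explicit edges: 100 bins of width 0.01 over 0 to 1, closed on the left.
edges = np.arange(101) / 100
fixed = pd.cut(flat, bins=edges, right=False)

# One line per (column, row name), then the row names gathered per bin.
long = fixed.rename("bin").reset_index()
long.columns = ["column", "row", "bin"]
names = long.groupby(["column", "bin"], observed=True)["row"].agg(list)
print(names)
```

The result is a Series indexed by (column, bin), so both columns are handled in one pass.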