Pandas Groupby then cumulative percentage of the groups
My dataset looks like this. I want to group it by the TSI column first and then find the cumulative percentage of ONS within each TSI group:
F TSI | F ONS | Cum%ONS |
---|---|---|
A1020 | 0.5 | |
A1020 | 1.3 | |
A1020 | 2.2 | |
A1020 | 3.1 | |
B1020 | 0.5 | |
B1020 | 1.4 | |
D1020 | 0.5 | |
D1020 | 1.4 | |
How do I find the cumulative percentage within each group, so that I get:
F TSI | F ONS | Cum%ONS |
---|---|---|
A1020 | 0.5 | 7.142% |
A1020 | 1.3 | 25.71% |
A1020 | 2.2 | 57.13% |
A1020 | 3.0 | 100% |
B1020 | 0.5 | 26.315% |
B1020 | 1.4 | 100% |
D1020 | 0.5 | 26.3% |
D1020 | 1.4 | 100% |
a = df.groupby(['TSI'])['Final ONS'].cumsum()
b = df.groupby(['TSI'])['Final ONS'].sum()
g = df.groupby(['TSI'])['Final ONS'].cumsum(axis=0)/df.groupby(['TSI'])['Final ONS'].sum()
g = pd.DataFrame(g)
g
This code gives me NaN values.
Here's one approach:
(i) groupby "F_TSI" and find the cumsum of "F_ONS" for each group
(ii) groupby "F_TSI" and transform the sum of "F_ONS" for each group
(iii) divide (i) by (ii) and multiply by 100 for the percentages
gb_obj = df.groupby('F_TSI')['F_ONS']
df['Cum%ONS'] = gb_obj.cumsum() / gb_obj.transform('sum') * 100
Output:
F_TSI F_ONS Cum%ONS
0 A1020 0.5 7.042254
1 A1020 1.3 25.352113
2 A1020 2.2 56.338028
3 A1020 3.1 100.000000
4 B1020 0.5 26.315789
5 B1020 1.4 100.000000
6 D1020 0.5 26.315789
7 D1020 1.4 100.000000
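For anyone who wants to reproduce this, here is a minimal self-contained sketch. It rebuilds the sample frame from the question (assuming the columns are named "F_TSI" and "F_ONS", as above) and also shows the likely reason the original attempt produced NaN: cumsum() keeps the original row index, while sum() returns one value per group indexed by the group key, so the division aligns on labels that never match. The names running, totals and misaligned are only illustrative.

```python
import pandas as pd

# Sample data from the question; column names are an assumption.
df = pd.DataFrame({
    "F_TSI": ["A1020"] * 4 + ["B1020"] * 2 + ["D1020"] * 2,
    "F_ONS": [0.5, 1.3, 2.2, 3.1, 0.5, 1.4, 0.5, 1.4],
})

gb_obj = df.groupby("F_TSI")["F_ONS"]

# cumsum() returns one value per row, indexed like df (0..7).
running = gb_obj.cumsum()

# sum() collapses each group to one value, indexed by the group key.
totals = gb_obj.sum()

# Series division aligns on index labels. Row labels 0..7 never match
# the keys "A1020", "B1020", "D1020", so every entry comes out NaN.
misaligned = running / totals

# transform("sum") broadcasts each group's total back onto the
# original rows, so the two Series share an index and divide row-wise.
df["Cum%ONS"] = running / gb_obj.transform("sum") * 100

print(df)
```

The key design choice is transform over sum: both compute the same group totals, but only transform returns them in the shape of the original frame.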