Group dataframe and get sum AND count?
I have a dataframe that looks like this:
              Company Name              Organisation Name   Amount
10118  Vifor Pharma UK Ltd  Welsh Assoc for Gastro & Endo  2700.00
10119  Vifor Pharma UK Ltd    Welsh IBD Specialist Group,   169.00
10120  Vifor Pharma UK Ltd             West Midlands AHSN  1200.00
10121  Vifor Pharma UK Ltd           Whittington Hospital    63.00
10122  Vifor Pharma UK Ltd                 Ysbyty Gwynedd    75.93
How do I sum the Amount and count the Organisation Name, to get a new dataframe that looks like this?
              Company Name  Organisation Count    Amount
10118  Vifor Pharma UK Ltd                   5  11000.00
I know how to sum or count:
df.groupby('Company Name').sum()
df.groupby('Company Name').count()
But not how to do both!
Solution 1:
Try this:
In [110]: (df.groupby('Company Name')
   .....:    .agg({'Organisation Name':'count', 'Amount': 'sum'})
   .....:    .reset_index()
   .....:    .rename(columns={'Organisation Name':'Organisation Count'})
   .....: )
Out[110]:
          Company Name   Amount  Organisation Count
0  Vifor Pharma UK Ltd  4207.93                   5
Or, if you don't want to reset the index and only need the sum and count of Amount:
df.groupby('Company Name')['Amount'].agg(['sum','count'])
or
df.groupby('Company Name').agg({'Amount': ['sum','count']})
Demo:
In [98]: df.groupby('Company Name')['Amount'].agg(['sum','count'])
Out[98]:
                         sum  count
Company Name
Vifor Pharma UK Ltd  4207.93      5
In [99]: df.groupby('Company Name').agg({'Amount': ['sum','count']})
Out[99]:
                      Amount
                         sum count
Company Name
Vifor Pharma UK Ltd  4207.93     5
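For anyone who wants to run the first approach end to end, here is a minimal sketch. It rebuilds only the five rows shown in the question (the original index labels 10118 to 10122 are dropped, which is an assumption on my part), so the numbers apply to that sample only.

```python
import pandas as pd

# Sample data: only the five rows visible in the question.
# Assumption: the original index labels (10118..10122) are not needed.
df = pd.DataFrame({
    "Company Name": ["Vifor Pharma UK Ltd"] * 5,
    "Organisation Name": [
        "Welsh Assoc for Gastro & Endo",
        "Welsh IBD Specialist Group,",
        "West Midlands AHSN",
        "Whittington Hospital",
        "Ysbyty Gwynedd",
    ],
    "Amount": [2700.00, 169.00, 1200.00, 63.00, 75.93],
})

# Count the organisations and sum the amounts per company,
# then turn the group key back into a column and rename the count.
summary = (
    df.groupby("Company Name")
      .agg({"Organisation Name": "count", "Amount": "sum"})
      .reset_index()
      .rename(columns={"Organisation Name": "Organisation Count"})
)

print(summary)
```

Keep in mind that 'count' skips missing values, so a row with a missing Organisation Name would not be included in Organisation Count.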
Solution 2:
Just in case you were wondering how to rename columns during aggregation, here's how for pandas >= 0.25, using named aggregation:
df.groupby('Company Name')['Amount'].agg(MySum='sum', MyCount='count')
Or,
df.groupby('Company Name').agg(MySum=('Amount', 'sum'), MyCount=('Amount', 'count'))
                       MySum  MyCount
Company Name
Vifor Pharma UK Ltd  4207.93        5
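Named aggregation can also produce the exact column names asked for in the question. Since 'Organisation Count' contains a space, it can't be written as a keyword argument directly, so this sketch unpacks a dict instead. The sample frame is rebuilt from the five rows in the question, and `named_summary` is just a name picked for illustration.

```python
import pandas as pd

# Sample data: only the five rows visible in the question.
df = pd.DataFrame({
    "Company Name": ["Vifor Pharma UK Ltd"] * 5,
    "Organisation Name": [
        "Welsh Assoc for Gastro & Endo",
        "Welsh IBD Specialist Group,",
        "West Midlands AHSN",
        "Whittington Hospital",
        "Ysbyty Gwynedd",
    ],
    "Amount": [2700.00, 169.00, 1200.00, 63.00, 75.93],
})

# Named aggregation (pandas >= 0.25): output name -> (input column, function).
# The dict is unpacked with ** because "Organisation Count" is not a valid
# Python identifier and so cannot be used as a plain keyword argument.
named_summary = (
    df.groupby("Company Name")
      .agg(**{
          "Organisation Count": ("Organisation Name", "count"),
          "Amount": ("Amount", "sum"),
      })
      .reset_index()
)

print(named_summary)
```

This avoids the separate rename step, and the columns come out in the order they are listed in the dict.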