Python Pandas Groupby isin
You need to filter the rows first with boolean indexing, and only then group and aggregate.
Sample:
import pandas as pd

df = pd.DataFrame({'Teams': ['Green', 'Blue', 'Red', 'Orange', 'Green', 'Blue', 'Grey', 'Purple'],
                   'Revenue': [18, 15, 20, 5, 10, 15, 2, 5],
                   'Location': ['A', 'B', 'V', 'G', 'A', 'D', 'B', 'C']})
print(df)
    Teams  Revenue Location
0   Green       18        A
1    Blue       15        B
2     Red       20        V
3  Orange        5        G
4   Green       10        A
5    Blue       15        D
6    Grey        2        B
7  Purple        5        C
First, get the rows with the largest Revenue values and select the Teams column:
Rev = df.nlargest(3,'Revenue')['Teams']
print (Rev)
2      Red
0    Green
1     Blue
Name: Teams, dtype: object
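One thing to keep in mind for the step above: by default `nlargest` returns exactly n rows, so a row that ties with the n-th largest value can be left out. A minimal sketch, assuming you want every tied row included, passes `keep='all'`; the names `top_rows` and `top_teams` are illustrative only and not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({'Teams': ['Green', 'Blue', 'Red', 'Orange', 'Green', 'Blue', 'Grey', 'Purple'],
                   'Revenue': [18, 15, 20, 5, 10, 15, 2, 5],
                   'Location': ['A', 'B', 'V', 'G', 'A', 'D', 'B', 'C']})

# keep='all' retains every row tied with the n-th largest value,
# so the result may contain more than n rows.
top_rows = df.nlargest(3, 'Revenue', keep='all')

# Reduce to the distinct team names for use with isin.
top_teams = top_rows['Teams'].unique()
print(top_teams)
```

With this sample the set of teams does not change, because the extra tied row (Revenue 15) also belongs to Blue.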
Then filter the original rows with boolean indexing and isin:
print (df[df['Teams'].isin(Rev)])
   Teams  Revenue Location
0  Green       18        A
1   Blue       15        B
2    Red       20        V
4  Green       10        A
5   Blue       15        D
Finally, group the filtered rows by Teams, sum Revenue, and sort in descending order:
df1 = (df[df['Teams'].isin(Rev)]
       .groupby('Teams', as_index=False)['Revenue']
       .sum()
       .sort_values('Revenue', ascending=False))
print(df1)
   Teams  Revenue
0   Blue       30
1  Green       28
2    Red       20
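The steps above can be collected into one self-contained script. This is only a sketch: the helper name `top_team_revenue` and its `n` parameter are hypothetical and were not part of the original answer.

```python
import pandas as pd


def top_team_revenue(df, n=3):
    """Sum Revenue per team, restricted to teams owning the n largest Revenue rows."""
    # Teams that appear in the n rows with the largest Revenue.
    top_teams = df.nlargest(n, 'Revenue')['Teams']
    # Keep all rows of those teams, then aggregate and sort.
    return (df[df['Teams'].isin(top_teams)]
            .groupby('Teams', as_index=False)['Revenue']
            .sum()
            .sort_values('Revenue', ascending=False))


df = pd.DataFrame({'Teams': ['Green', 'Blue', 'Red', 'Orange', 'Green', 'Blue', 'Grey', 'Purple'],
                   'Revenue': [18, 15, 20, 5, 10, 15, 2, 5],
                   'Location': ['A', 'B', 'V', 'G', 'A', 'D', 'B', 'C']})

result = top_team_revenue(df)
print(result)
```

Note that the teams are chosen by their largest single rows, not by their totals; the sum is computed only after the filter.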
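If you need several columns in the output, specify an aggregation function for each of them, for example:
df2 = (df[df['Teams'].isin(Rev)]
       .groupby('Teams', as_index=False)
       .agg({'Revenue': 'sum', 'Location': ', '.join}))
# Add entries for any other columns your data has, e.g. 'Another col': 'mean'
# (that column does not exist in this sample, so it is omitted here).
print(df2)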
   Teams  Revenue Location
0   Blue       30     B, D
1  Green       28     A, A
2    Red       20        V
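The same per-column aggregation can also be written with named aggregation, which lets you choose the output column names. A minimal sketch, assuming pandas 0.25 or later; the output name `Locations` and the variable `df3` are illustrative choices, not from the original answer.

```python
import pandas as pd

df = pd.DataFrame({'Teams': ['Green', 'Blue', 'Red', 'Orange', 'Green', 'Blue', 'Grey', 'Purple'],
                   'Revenue': [18, 15, 20, 5, 10, 15, 2, 5],
                   'Location': ['A', 'B', 'V', 'G', 'A', 'D', 'B', 'C']})

Rev = df.nlargest(3, 'Revenue')['Teams']

# Each keyword is an output column; the tuple is (input column, aggregation).
df3 = (df[df['Teams'].isin(Rev)]
       .groupby('Teams', as_index=False)
       .agg(Revenue=('Revenue', 'sum'),
            Locations=('Location', ', '.join)))
print(df3)
```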