How can I use cumsum within a group in Pandas?

Solution 1:

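You can call transform on the grouped column and pass it the cumsum function. The result is aligned to the original index, so you can assign it straight back to your df as a new column: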

In [156]:
df['cumsum'] = df.groupby('id')['val'].transform(pd.Series.cumsum)
df

Out[156]:
  id   stuff  val  cumsum
0  A      12    1       1
1  B   23232    2       2
2  A      13   -3      -2
3  C    1234    1       1
4  D    3235    5       5
5  B    3236    6       8
6  C  732323   -2      -1

With respect to your error: cumsum does not take column names as an argument, so passing the name of the column as a list is meaningless. Select the column from the groupby object first, then call cumsum on the resulting Series groupby object.

So this works:

In [159]:
df.groupby('id')['val'].cumsum()

Out[159]:
0    1
1    2
2   -2
3    1
4    5
5    8
6   -1
dtype: int64
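
For reference, here is a minimal self-contained sketch that rebuilds the example frame from the output above and runs both approaches side by side. The variable names via_transform and via_cumsum are only illustrative, and the frame is reconstructed from the printed table rather than taken from the original question's code.

```python
import pandas as pd

# Rebuild the example frame from the table printed above.
df = pd.DataFrame({
    "id": ["A", "B", "A", "C", "D", "B", "C"],
    "stuff": [12, 23232, 13, 1234, 3235, 3236, 732323],
    "val": [1, 2, -3, 1, 5, 6, -2],
})

# Approach 1: transform with the cumsum function.
via_transform = df.groupby("id")["val"].transform(pd.Series.cumsum)

# Approach 2: call cumsum directly on the selected column of the groupby.
via_cumsum = df.groupby("id")["val"].cumsum()

# Both results keep df's original index, so either can be assigned as a column.
df["cumsum"] = via_cumsum
print(df)

# Compare the two approaches element by element.
print(via_transform.equals(via_cumsum))
```

Calling cumsum directly is the shorter of the two; transform is mainly useful when the function you want to apply per group has no dedicated groupby method.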