Pandas Pivot tables row subtotals

If you put State and City not both in the rows, you'll get separate margins. Reshape and you get the table you're after:

In [10]: table = pivot_table(df, values=['SalesToday', 'SalesMTD','SalesYTD'],\
                     rows=['State'], cols=['City'], aggfunc=np.sum, margins=True)


In [11]: table.stack('City')
Out[11]: 
            SalesMTD  SalesToday  SalesYTD
State City                                
stA   All        900          50      2100
      ctA        400          20      1000
      ctB        500          30      1100
stB   All        700          50      2200
      ctC        500          10       900
      ctD        200          40      1300
stC   All        300          30       800
      ctF        300          30       800
All   All       1900         130      5100
      ctA        400          20      1000
      ctB        500          30      1100
      ctC        500          10       900
      ctD        200          40      1300
      ctF        300          30       800

I admit this isn't totally obvious.

You can get the summarized values by using groupby() on the State column.

Lets make some sample data first:

import pandas as pd
import StringIO

incsv = StringIO.StringIO("""Date,State,City,SalesToday,SalesMTD,SalesYTD
20130320,stA,ctA,20,400,1000
20130320,stA,ctB,30,500,1100
20130320,stB,ctC,10,500,900
20130320,stB,ctD,40,200,1300
20130320,stC,ctF,30,300,800""")

df = pd.read_csv(incsv, index_col=['Date'], parse_dates=True)

Then apply the groupby function and add a column City:

dfsum = df.groupby('State', as_index=False).sum()
dfsum['City'] = 'All'

print dfsum

  State  SalesToday  SalesMTD  SalesYTD City
0   stA          50       900      2100  All
1   stB          50       700      2200  All
2   stC          30       300       800  All

We can append the original data to the summed df by using append:

dfsum.append(df).set_index(['State','City']).sort_index()

print dfsum

            SalesMTD  SalesToday  SalesYTD
State City                                
stA   All        900          50      2100
      ctA        400          20      1000
      ctB        500          30      1100
stB   All        700          50      2200
      ctC        500          10       900
      ctD        200          40      1300
stC   All        300          30       800
      ctF        300          30       800

I added the set_index and sort_index to make it look more like your example output, its not strictly necessary to get the results.

I Think this subtotal example code is what you want(similar to excel subtotal)

I assume that you want group by columns A, B, C, D, than count column value of E

main_df.groupby(['A', 'B', 'C']).apply(lambda sub_df: sub_df\
       .pivot_table(index=['D'], values=['E'], aggfunc='count', margins=True)

output:

A B C  D  E
       a  1 
a a a  b  2
       c  2
     all  5
       a  3 
b b a  b  2
       c  2
     all  7
       a  3 
b b b  b  6
       c  2
       d  3
     all 14

How about this one ?

table = pd.pivot_table(data, index=['State'],columns = ['City'],values=['SalesToday', 'SalesMTD','SalesYTD'],\
                      aggfunc=np.sum, margins=True)

enter image description here

Pandas Pivot tables row subtotals

Related

Recent Posts