Pandas groupby for zero values

Solution 1:

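Count the rows per group, then unstack with fill_value=0 and stack back, so that every missing Symbol/Year combination shows up with a count of 0: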

df = df.groupby(['Symbol','Year']).count().unstack(fill_value=0).stack()
print(df)

Output:

             Action
Symbol Year        
AAPL   2001       2
       2002       0
BAC    2001       0
       2002       2
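
For a self-contained run of the chain above, here is a minimal sketch. The sample rows and the Action values ("Buy", "Sell") are assumptions, since the original frame is not shown; they are chosen only so that the counts match the output above. The names counts, wide and result are illustrative.

```python
import pandas as pd

# Hypothetical sample data (an assumption, the question's frame is not shown)
df = pd.DataFrame({
    "Symbol": ["AAPL", "AAPL", "BAC", "BAC"],
    "Year": [2001, 2001, 2002, 2002],
    "Action": ["Buy", "Sell", "Buy", "Sell"],
})

# Step 1: count per group; only combinations present in the data appear
counts = df.groupby(["Symbol", "Year"]).count()

# Step 2: move Year into the columns; gaps in the grid are filled with 0
wide = counts.unstack(fill_value=0)

# Step 3: move Year back into the index, keeping the filled-in zeros
result = wide.stack()
print(result)
```

Splitting the chain into steps makes it easier to see that the zeros are introduced by unstack(fill_value=0), not by groupby itself.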

Solution 2:

You can use pivot_table with unstack:

print(df.pivot_table(index='Symbol',
                     columns='Year',
                     values='Action',
                     fill_value=0,
                     aggfunc='count').unstack())

Year  Symbol
2001  AAPL      2
      BAC       0
2002  AAPL      0
      BAC       2
dtype: int64

If you need the output as a DataFrame, use to_frame:

print(df.pivot_table(index='Symbol',
                     columns='Year',
                     values='Action',
                     fill_value=0,
                     aggfunc='count')
        .unstack()
        .to_frame()
        .rename(columns={0: 'Action'}))

             Action
Year Symbol        
2001 AAPL         2
     BAC          0
2002 AAPL         0
     BAC          2
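
The pivot_table route can be sketched end to end as follows. The sample rows are again an assumption (the original frame is not shown), and the names table and long_form are illustrative. The swaplevel/sort_index step at the end is an optional extra, not part of the answer above; it only reorders the index to (Symbol, Year) so it lines up with the groupby result from Solution 1.

```python
import pandas as pd

# Hypothetical sample data (an assumption, the question's frame is not shown)
df = pd.DataFrame({
    "Symbol": ["AAPL", "AAPL", "BAC", "BAC"],
    "Year": [2001, 2001, 2002, 2002],
    "Action": ["Buy", "Sell", "Buy", "Sell"],
})

# Wide table: one row per Symbol, one column per Year, empty cells filled with 0
table = df.pivot_table(index="Symbol", columns="Year", values="Action",
                       fill_value=0, aggfunc="count")
print(table)

# Long form: a one-column DataFrame indexed by (Year, Symbol)
long_form = table.unstack().to_frame(name="Action")

# Optional: reorder the index levels to (Symbol, Year) and sort
long_form = long_form.swaplevel().sort_index()
print(long_form)
```

Passing name="Action" to to_frame avoids the separate rename of column 0.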

Solution 3:

Datatype category

This feature may not have existed when this thread was opened, but the "category" dtype can help here:

import pandas as pd

# create a DataFrame in which the combination a=0, b=1 is missing
df = pd.DataFrame({"a": [0, 1, 1], "b": [0, 1, 0]})
df = df.astype({"a": "category", "b": "category"})
print(df)

The DataFrame looks like this:

   a  b
0  0  0
1  1  1
2  1  0

Now group by a and b:

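# observed=False keeps category combinations that never occur in the data;
# pass it explicitly, since recent pandas versions deprecate relying on the default
print(df.groupby(["a", "b"], observed=False).size())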

yields:

a  b
0  0    1
   1    0
1  0    1
   1    1
dtype: int64

Note the count of 0 for the combination a=0, b=1, which never occurs in the data. This behavior is also documented in the pandas user guide (search the page for "groupby").
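
Whether the zero rows appear is controlled by the observed parameter of groupby. The sketch below contrasts both settings on the same data; the names all_pairs and seen_only are illustrative. Passing the parameter explicitly is probably the safest choice, because its default is not the same across pandas versions.

```python
import pandas as pd

# Same data as above: the combination a=0, b=1 never occurs
df = pd.DataFrame({"a": [0, 1, 1], "b": [0, 1, 0]}).astype("category")

# observed=False: every combination of the categories, unseen ones get size 0
all_pairs = df.groupby(["a", "b"], observed=False).size()
print(all_pairs)

# observed=True: only the combinations that actually occur in the data
seen_only = df.groupby(["a", "b"], observed=True).size()
print(seen_only)
```

With observed=False the result has four rows, including the zero for a=0, b=1; with observed=True that row is dropped and only three remain.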