Pandas counting and summing specific conditions
You can first make a conditional selection, and sum up the results of the selection using the sum
function.
>> df = pd.DataFrame({'a': [1, 2, 3]})
>> df[df.a > 1].sum()
a 5
dtype: int64
Having more than one condition:
>> df[(df.a > 1) & (df.a < 3)].sum()
a 2
dtype: int64
If you want to do COUNTIF
, just replace sum()
with count()
You didn't mention the fancy indexing capabilities of dataframes, e.g.:
>>> df = pd.DataFrame({"class":[1,1,1,2,2], "value":[1,2,3,4,5]})
>>> df[df["class"]==1].sum()
class 3
value 6
dtype: int64
>>> df[df["class"]==1].sum()["value"]
6
>>> df[df["class"]==1].count()["value"]
3
You could replace df["class"]==1
by another condition.
I usually use numpy sum over the logical condition column:
>>> import numpy as np
>>> import pandas as pd
>>> df = pd.DataFrame({'Age' : [20,24,18,5,78]})
>>> np.sum(df['Age'] > 20)
2
This seems to me slightly shorter than the solution presented above