Wan to get group by date and status from the data
You can convert your 'created_at' column to datetime using pd.to_datetime
, filter for your 'success' rows, and then use groupby
per month with the aggregation of count
:
df['created_at'] = pd.to_datetime(df['created_at'])
out = df.loc[df.status.eq('success')].groupby(df.created_at.dt.month).status.count()
prints:
created_at success
0 1 2
1 12 1
# as a dict
>>> out.set_index('created_at').to_dict()
{'success': {1: 2, 12: 1}}
Then you can just convert the month numbers to month names as shown here
You have to parse the date using datetime.strptime
first, and use itetools.groupby
to group based on month.
>>> from datetime import datetime
>>> import calendar
>>>
>>> from itertools import groupby
>>>
>>> data = [
... {"name": "aa", "created_at": "2022-01-17 07:38:26.403Z", "status": "success"},
... {"name": "bb", "created_at": "2021-12-1 07:38:26.403Z", "status": "failed"},
... {"name": "kk", "created_at": "2022-01-13 07:38:26.403Z", "status": "success"},
... {"name": "ll", "created_at": "2021-12-17 07:38:26.403Z", "status": "success"},
... ]
>>>
>>>
>>> def custom_key_fun(row):
... created_at_as_datetime = datetime.strptime(
... row["created_at"], "%Y-%m-%d %H:%M:%S.%fZ"
... )
... month = calendar.month_abbr[created_at_as_datetime.month]
... return month
...
>>>
>>> result = {
... key: sum(1 for row in group if row["status"] == "success")
... for key, group in groupby(sorted(data, key=custom_key_fun), key=custom_key_fun)
... }
>>>
>>> print(result)
{'Dec': 1, 'Jan': 2}
Load Pandas
import pandas as pd
from datetime import datetime
data = [{"name": "aa", "created_at": "2022-01-17 07:38:26.403Z", "status": "success"},
{"name": "bb", "created_at": "2021-12-1 07:38:26.403Z", "status": "failed"},
{"name": "kk", "created_at": "2022-01-13 07:38:26.403Z", "status": "success"},
{"name": "ll", "created_at": "2021-12-17 07:38:26.403Z", "status": "success"},
]
df = pd.DataFrame(data)
df
name created_at status
0 aa 2022-01-17 07:38:26.403Z success
1 bb 2021-12-1 07:38:26.403Z failed
2 kk 2022-01-13 07:38:26.403Z success
3 ll 2021-12-17 07:38:26.403Z success
Build new column
def date_convert(x):
x = x.split('.')[0]
return datetime.strptime(x, "%Y-%m-%d %H:%M:%S").strftime("%B")
df['month_name'] = df.created_at.apply(date_convert)
df
name created_at status month_name
0 aa 2022-01-17 07:38:26.403Z success January
1 bb 2021-12-1 07:38:26.403Z failed December
2 kk 2022-01-13 07:38:26.403Z success January
3 ll 2021-12-17 07:38:26.403Z success December
Calculate & Build New DataFrame
group_month = df[df.status == "success"].groupby('month_name').groups
data = [{"created_at": i, "success": len(j)} for i, j in group_month.items()]
new_df = pd.DataFrame(data)
new_df
created_at success
0 December 1
1 January 2