group by week in pandas
I'm having this data frame:
Name Date Quantity
Apple 07/11/17 20
orange 07/14/17 20
Apple 07/14/17 70
Orange 07/25/17 40
Apple 07/20/17 30
I want to aggregate this by Name
and Date
to get sum of quantities
Details:
Date: Group, the result should be at the beginning of the week (or just on Monday)
Quantity: Sum, if two or more records have same Name and Date (if falls on same interval)
The desired output is given below:
Name Date Quantity
Apple 07/10/17 90
orange 07/10/17 20
Apple 07/17/17 30
orange 07/24/17 40
Thanks in advance
Solution 1:
First convert column date
to_datetime
and substract one week, as we want to sum for the week ahead of the date, not the week before that date.
Then use groupby
with Grouper
by W-MON and aggregate sum
:
df['Date'] = pd.to_datetime(df['Date']) - pd.to_timedelta(7, unit='d')
df = df.groupby(['Name', pd.Grouper(key='Date', freq='W-MON')])['Quantity']
.sum()
.reset_index()
.sort_values('Date')
print (df)
Name Date Quantity
0 Apple 2017-07-10 90
3 orange 2017-07-10 20
1 Apple 2017-07-17 30
2 Orange 2017-07-24 40
Solution 2:
Let's use groupby
, resample
with W-Mon
, and sum
:
df.groupby('Name').resample('W-Mon', on='Date').sum().reset_index().sort_values(by='Date')
Output:
Name Date Quantity
0 Apple 2017-07-17 90
3 orange 2017-07-17 20
1 Apple 2017-07-24 30
2 Orange 2017-07-31 40
Solution 3:
First convert column date to_datetime. This will group by week starting with Mondays. It will output the week number (but you can change that looking up in
http://strftime.org/
df.groupby(['name', df['date'].dt.strftime('%W')])['quantity'].sum()
Output:
name date
apple 28 90
29 30
orange 28 20
30 40
Solution 4:
This groups every row on the previous Monday (if the date is already Monday, nothing is changed). This has the effect of grouping by week:
import pandas as pd, datetime as dt
# df = ...
df['WeekDate'] = df.apply(lambda row: row['Date'] - dt.timedelta(days=row['Date'].weekday()), axis=1)
perweek = df['WeekDate'].groupby(df['WeekDate']).count()
Exemple:
Date WeekDate
2020-06-20 2020-06-15 <- monday
2020-06-21 2020-06-15
2020-06-24 2020-06-22 <- monday
2020-06-25 2020-06-22
2020-06-26 2020-06-22