I'm having this data frame:

Name   Date    Quantity
Apple  07/11/17  20
orange 07/14/17  20
Apple  07/14/17  70
Orange 07/25/17  40
Apple  07/20/17  30

I want to aggregate this by Name and Date to get sum of quantities Details:

Date: Group, the result should be at the beginning of the week (or just on Monday)

Quantity: Sum, if two or more records have same Name and Date (if falls on same interval)

The desired output is given below:

Name   Date    Quantity
Apple  07/10/17  90
orange 07/10/17  20
Apple  07/17/17  30
orange 07/24/17  40

Thanks in advance


Solution 1:

First convert column date to_datetime and substract one week, as we want to sum for the week ahead of the date, not the week before that date.

Then use groupby with Grouper by W-MON and aggregate sum:

df['Date'] = pd.to_datetime(df['Date']) - pd.to_timedelta(7, unit='d')
df = df.groupby(['Name', pd.Grouper(key='Date', freq='W-MON')])['Quantity']
       .sum()
       .reset_index()
       .sort_values('Date')
print (df)
     Name       Date  Quantity
0   Apple 2017-07-10        90
3  orange 2017-07-10        20
1   Apple 2017-07-17        30
2  Orange 2017-07-24        40

Solution 2:

Let's use groupby, resample with W-Mon, and sum:

df.groupby('Name').resample('W-Mon', on='Date').sum().reset_index().sort_values(by='Date')

Output:

     Name       Date  Quantity
0   Apple 2017-07-17        90
3  orange 2017-07-17        20
1   Apple 2017-07-24        30
2  Orange 2017-07-31        40

Solution 3:

First convert column date to_datetime. This will group by week starting with Mondays. It will output the week number (but you can change that looking up in

http://strftime.org/

df.groupby(['name', df['date'].dt.strftime('%W')])['quantity'].sum()

Output:

name    date
apple   28      90
        29      30
orange  28      20
        30      40

Solution 4:

This groups every row on the previous Monday (if the date is already Monday, nothing is changed). This has the effect of grouping by week:

import pandas as pd, datetime as dt

# df = ...

df['WeekDate'] = df.apply(lambda row: row['Date'] - dt.timedelta(days=row['Date'].weekday()), axis=1)

perweek = df['WeekDate'].groupby(df['WeekDate']).count()

Exemple:

Date           WeekDate
2020-06-20     2020-06-15 <- monday
2020-06-21     2020-06-15
2020-06-24     2020-06-22 <- monday 
2020-06-25     2020-06-22
2020-06-26     2020-06-22