Add months to a date in Pandas
You could use pd.DateOffset
In [1756]: df.date + pd.DateOffset(months=plus_month_period)
Out[1756]:
0 2017-01-11
1 2017-02-01
Name: date, dtype: datetime64[ns]
Another way using pd.offsets.MonthOffset
In [1785]: df.date + pd.offsets.MonthOffset(plus_month_period)
Out[1785]:
0 2016-10-14
1 2016-11-04
Name: date, dtype: datetime64[ns]
Details
In [1757]: df
Out[1757]:
date
0 2016-10-11
1 2016-11-01
In [1758]: plus_month_period
Out[1758]: 3
Suppose you have a dataframe of the following format, where you have to add integer months to a date column.
Start_Date | Months_to_add |
---|---|
2014-06-01 | 23 |
2014-06-01 | 4 |
2000-10-01 | 10 |
2016-07-01 | 3 |
2017-12-01 | 90 |
2019-01-01 | 2 |
In such a scenario, using Zero's code or mattblack's code won't be useful. You have to use lambda function over the rows where the function takes 2 arguments -
- A date to which months need to be added to
- A month value in integer format
You can use the following function:
# Importing required modules
from dateutil.relativedelta import relativedelta
# Defining the function
def add_months(start_date, delta_period):
end_date = start_date + relativedelta(months=delta_period)
return end_date
After this you can use the following code snippet to add months to the Start_Date
column. Use progress_apply
functionality of Pandas. Refer to this Stackoverflow answer on progress_apply
: Progress indicator during pandas operations.
from tqdm import tqdm
tqdm.pandas()
df["End_Date"] = df.progress_apply(lambda row: add_months(row["Start_Date"], row["Months_to_add"]), axis = 1)
Here's the full code form dataset creation, for your reference:
import pandas as pd
from dateutil.relativedelta import relativedelta
from tqdm import tqdm
tqdm.pandas()
# Initilize a new dataframe
df = pd.DataFrame()
# Add Start Date column
df["Start_Date"] = ['2014-06-01T00:00:00.000000000',
'2014-06-01T00:00:00.000000000',
'2000-10-01T00:00:00.000000000',
'2016-07-01T00:00:00.000000000',
'2017-12-01T00:00:00.000000000',
'2019-01-01T00:00:00.000000000']
# To convert the date column to a datetime format
df["Start_Date"] = pd.to_datetime(df["Start_Date"])
# Add months column
df["Months_to_add"] = [23, 4, 10, 3, 90, 2]
# Defining the Add Months function
def add_months(start_date, delta_period):
end_date = start_date + relativedelta(months=delta_period)
return end_date
# Apply function on the dataframe using lambda operation.
df["End_Date"] = df.progress_apply(lambda row: add_months(row["Start_Date"], row["Months_to_add"]), axis = 1)
You will have the final output dataframe as follows.
Start_Date | Months_to_add | End_Date |
---|---|---|
2014-06-01 | 23 | 2016-05-01 |
2014-06-01 | 4 | 2014-10-01 |
2000-10-01 | 10 | 2001-08-01 |
2016-07-01 | 3 | 2016-10-01 |
2017-12-01 | 90 | 2025-06-01 |
2019-01-01 | 2 | 2019-03-01 |
Please add to comments if there are any issues with the above code.
All the best!