Timestamp subtraction must have the same timezones or no timezones but they are both UTC

There are questions that address the same error, TypeError: Timestamp subtraction must have the same timezones or no timezones, but none of them covers this particular case.

I have two UTC Timestamps that raise this error when subtracted.

print(date, type(date), date.tzinfo)
# 2020-07-17 00:00:00+00:00 <class 'pandas._libs.tslibs.timestamps.Timestamp'> UTC
print(date2, type(date2), date2.tzinfo)
# 2020-04-06 00:00:00.000000001+00:00 <class 'pandas._libs.tslibs.timestamps.Timestamp'> UTC
date - date2
# TypeError: Timestamp subtraction must have the same timezones or no timezones

Edit: I'm using Python 3.6.9 and Pandas 1.0.1


Solution 1:

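I had the same problem. When you read data with pandas read_csv, the parsed UTC dates carry a pytz tzinfo (<class 'pytz.UTC'>), whereas datetime.now(tz=timezone.utc) carries <class 'datetime.timezone'>. The solution for me was simply to use the same class everywhere.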

Sample code generating error

from datetime import datetime, timedelta, timezone
import pandas as pd

now = datetime.now(tz=timezone.utc)
some_time_ago = now - timedelta(7)

print('Timezone info before reading_csv')
print(some_time_ago.tzinfo, type(some_time_ago.tzinfo))

time_passed = now - some_time_ago
print (time_passed)

df = pd.DataFrame([some_time_ago], columns=['date'])
df.to_csv('dates.csv', index=False)

df2 = pd.read_csv('dates.csv', parse_dates=['date'])
print('\nTimezone info after reading_csv')
print(df2.iloc[0,0].tzinfo, type(df2.iloc[0,0].tzinfo))

now = datetime.now(tz=timezone.utc)
some_time_ago = now - df2.iloc[0,0]
print(some_time_ago)

Output:

Timezone info before reading_csv
UTC <class 'datetime.timezone'>
7 days, 0:00:00

Timezone info after reading_csv
UTC <class 'pytz.UTC'>
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-23-b2815e32e8b7> in <module>
     19 
     20 now = datetime.now(tz=timezone.utc)
---> 21 some_time_ago = now - df2.iloc[0,0]
     22 print(some_time_ago)

pandas/_libs/tslibs/c_timestamp.pyx in pandas._libs.tslibs.c_timestamp._Timestamp.__sub__()

TypeError: Timestamp subtraction must have the same timezones or no timezones

Correct code with pytz

import pytz
from datetime import datetime, timedelta
import pandas as pd

now = datetime.now(tz=pytz.UTC)
some_time_ago = now - timedelta(7)

print('Timezone info before reading_csv')
print(some_time_ago.tzinfo, type(some_time_ago.tzinfo))

time_passed = now - some_time_ago
print (time_passed)

df = pd.DataFrame([some_time_ago], columns=['date'])
df.to_csv('dates.csv', index=False)

df2 = pd.read_csv('dates.csv', parse_dates=['date'])
print('\nTimezone info after reading_csv')
print(df2.iloc[0,0].tzinfo, type(df2.iloc[0,0].tzinfo))

now = datetime.now(tz=pytz.UTC)
some_time_ago = now - df2.iloc[0,0]
print(some_time_ago)

Output:

Timezone info before reading_csv
UTC <class 'pytz.UTC'>
7 days, 0:00:00

Timezone info after reading_csv
UTC <class 'pytz.UTC'>
7 days 00:00:00.024021
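
If you would rather not import pytz directly, another way to keep the timezone handling consistent is to let pandas build both sides of the subtraction. The following is only a minimal sketch under assumptions: the in-memory CSV text stands in for dates.csv, the fixed reference date replaces datetime.now so the result is reproducible, and the behaviour of pd.to_datetime(..., utc=True) may differ in detail between pandas versions.

```python
import io

import pandas as pd

# Hypothetical in-memory CSV standing in for dates.csv.
csv_text = "date\n2020-04-06 00:00:00+00:00\n"
df2 = pd.read_csv(io.StringIO(csv_text))

# Parse explicitly as UTC so pandas chooses the tzinfo object itself.
df2["date"] = pd.to_datetime(df2["date"], utc=True)

# Build the reference instant through pandas as well, instead of
# datetime.now(tz=timezone.utc), so no standard-library tzinfo is mixed in.
reference = pd.Timestamp("2020-07-17", tz="UTC")

# Subtracting the column from the reference yields a Series of Timedelta.
elapsed = reference - df2["date"]
print(elapsed.iloc[0])
```

In real code, pd.Timestamp.now(tz="UTC") would play the role of reference.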

Solution 2:

After checking the timezone types: type(date.tzinfo) gives <class 'datetime.timezone'> and type(date2.tzinfo) gives <class 'pytz.UTC'>, so according to the pandas source code they are not considered equal, even though they are both UTC.

So the solution was to give both Timestamps the same tzinfo type (either pytz or datetime.timezone).
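
One way to do that, sketched here under assumptions rather than taken from the question, is to pass both values through Timestamp.tz_convert with the same target, so that pandas assigns the same tzinfo object to each. The variables date and date2 below are rebuilt by hand to mimic the two tzinfo sources; which concrete class "UTC" maps to depends on the pandas version.

```python
from datetime import datetime, timezone

import pandas as pd

# Stand-ins for the two values in the question: one built from a
# standard-library tzinfo, one built by pandas from the string "UTC".
date = pd.Timestamp(datetime(2020, 7, 17, tzinfo=timezone.utc))
date2 = pd.Timestamp("2020-04-06", tz="UTC")

# Convert both with the same target so they end up with the same tzinfo.
# The instant in time is unchanged; only the attached tzinfo may differ.
date_n = date.tz_convert("UTC")
date2_n = date2.tz_convert("UTC")

print(type(date_n.tzinfo), type(date2_n.tzinfo))

delta = date_n - date2_n
print(delta)
```

Comparing type(x.tzinfo) before and after the conversion is a quick way to confirm whether a mismatch was the cause.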

This is an open issue on GitHub: https://github.com/pandas-dev/pandas/issues/32619