Timestamp subtraction must have the same timezones or no timezones but they are both UTC
There are questions that address the same error, TypeError: Timestamp subtraction must have the same timezones or no timezones,
but none of them covers this particular case.
I have two UTC Timestamps that raise that error when subtracted.
print(date, type(date), date.tzinfo)
>>> 2020-07-17 00:00:00+00:00 <class 'pandas._libs.tslibs.timestamps.Timestamp'> UTC
print(date2, type(date2), date2.tzinfo)
>>> 2020-04-06 00:00:00.000000001+00:00 <class 'pandas._libs.tslibs.timestamps.Timestamp'> UTC
date - date2
>>> TypeError: Timestamp subtraction must have the same timezones or no timezones
Edit: I'm using Python 3.6.9 and Pandas 1.0.1
Solution 1:
I had the same problem. If you read data with pandas read_csv, the parsed timestamps use <class 'pytz.UTC'> as their tzinfo, so the solution for me was simply to use the same class everywhere.
Sample code generating error
from datetime import datetime, timedelta, timezone
import pandas as pd
now = datetime.now(tz=timezone.utc)
some_time_ago = now - timedelta(7)
print('Timezone info before reading_csv')
print(some_time_ago.tzinfo, type(some_time_ago.tzinfo))
time_passed = now - some_time_ago
print(time_passed)
df = pd.DataFrame([some_time_ago], columns=['date'])
df.to_csv('dates.csv', index=False)
df2 = pd.read_csv('dates.csv', parse_dates=['date'])
print('\nTimezone info after reading_csv')
print(df2.iloc[0,0].tzinfo, type(df2.iloc[0,0].tzinfo))
now = datetime.now(tz=timezone.utc)
some_time_ago = now - df2.iloc[0,0]
print(some_time_ago)
Timezone info before reading_csv
UTC <class 'datetime.timezone'>
7 days, 0:00:00
Timezone info after reading_csv
UTC <class 'pytz.UTC'>
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-23-b2815e32e8b7> in <module>
19
20 now = datetime.now(tz=timezone.utc)
---> 21 some_time_ago = now - df2.iloc[0,0]
22 print(some_time_ago)
pandas/_libs/tslibs/c_timestamp.pyx in pandas._libs.tslibs.c_timestamp._Timestamp.__sub__()
TypeError: Timestamp subtraction must have the same timezones or no timezones
Correct code with pytz
import pytz
from datetime import datetime, timedelta
import pandas as pd
now = datetime.now(tz=pytz.UTC)
some_time_ago = now - timedelta(7)
print('Timezone info before reading_csv')
print(some_time_ago.tzinfo, type(some_time_ago.tzinfo))
time_passed = now - some_time_ago
print(time_passed)
df = pd.DataFrame([some_time_ago], columns=['date'])
df.to_csv('dates.csv', index=False)
df2 = pd.read_csv('dates.csv', parse_dates=['date'])
print('\nTimezone info after reading_csv')
print(df2.iloc[0,0].tzinfo, type(df2.iloc[0,0].tzinfo))
now = datetime.now(tz=pytz.UTC)
some_time_ago = now - df2.iloc[0,0]
print(some_time_ago)
Timezone info before reading_csv
UTC <class 'pytz.UTC'>
7 days, 0:00:00
Timezone info after reading_csv
UTC <class 'pytz.UTC'>
7 days 00:00:00.024021
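If switching every call site to pytz is not practical, another option is to normalize both operands right before subtracting. Below is a minimal sketch, assuming that passing both values through pd.Timestamp(...).tz_convert("UTC") leaves them with the same tzinfo class on your pandas version. The helper name to_pandas_utc is hypothetical, and the CSV content and the fixed "now" value are made up for illustration.

```python
import io
from datetime import datetime, timezone

import pandas as pd


def to_pandas_utc(value):
    """Return value as a Timestamp carrying the UTC tzinfo that pandas picks.

    Hypothetical helper. value must be timezone-aware, because tz_convert
    raises TypeError on naive timestamps.
    """
    return pd.Timestamp(value).tz_convert("UTC")


# Made-up CSV content standing in for dates.csv.
csv_text = "date\n2020-07-10 12:00:00+00:00\n"
df2 = pd.read_csv(io.StringIO(csv_text))
# utc=True makes the parsed column timezone-aware UTC.
df2["date"] = pd.to_datetime(df2["date"], utc=True)

# A fixed "now" built with the standard-library UTC class.
now = datetime(2020, 7, 17, 12, 0, tzinfo=timezone.utc)

# Both operands go through the same conversion before the subtraction.
elapsed = to_pandas_utc(now) - to_pandas_utc(df2.iloc[0, 0])
print(elapsed)
```

The point of the design is that neither side keeps the tzinfo object it arrived with, so it does not matter whether the value came from read_csv or from the datetime module.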
Solution 2:
After checking the timezone types, type(date.tzinfo) gives <class 'datetime.timezone'> and type(date2.tzinfo) gives <class 'pytz.UTC'>, so according to the pandas source code they are not considered equal even though they are both UTC.
So the solution was to make them have the same tzinfo type (either pytz or datetime.timezone).
This is an open issue on GitHub: https://github.com/pandas-dev/pandas/issues/32619
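The check and the fix described above can be sketched with the standard library alone. This is only an illustration under stated assumptions: StandInUTC is a hypothetical stand-in for a third-party UTC class such as pytz.UTC (so the snippet runs without extra packages), and tz_classes_match and to_stdlib_utc are made-up helper names. Plain datetime objects subtract fine even with mismatched tzinfo classes, so the sketch shows only the diagnosis and the normalization, not the pandas error itself.

```python
from datetime import datetime, timedelta, timezone, tzinfo


class StandInUTC(tzinfo):
    """Hypothetical stand-in for a third-party UTC class such as pytz.UTC."""

    def utcoffset(self, dt):
        return timedelta(0)

    def dst(self, dt):
        return timedelta(0)

    def tzname(self, dt):
        return "UTC"


def tz_classes_match(a, b):
    """True when both datetimes carry tzinfo objects of the same class."""
    return type(a.tzinfo) is type(b.tzinfo)


def to_stdlib_utc(dt):
    """Re-express an aware datetime using datetime.timezone.utc."""
    return dt.astimezone(timezone.utc)


date = datetime(2020, 7, 17, tzinfo=timezone.utc)
date2 = datetime(2020, 4, 6, tzinfo=StandInUTC())

# Both are UTC, but the tzinfo classes differ.
print(type(date.tzinfo), type(date2.tzinfo))
print(tz_classes_match(date, date2))

# After normalizing, both sides share datetime.timezone.utc.
a, b = to_stdlib_utc(date), to_stdlib_utc(date2)
print(tz_classes_match(a, b))
print(a - b)
```

pandas Timestamp objects also provide astimezone, so the same normalization idea should carry over to them, though I have only exercised it here with plain datetime values.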