How to parse dates with -0400 timezone string in Python?

I have a date string of the form '2009/05/13 19:19:30 -0400'. It seems that previous versions of Python may have supported a %z format tag in strptime for the trailing timezone specification, but 2.6.x seems to have removed that.

What's the right way to parse this string into a datetime object?


Solution 1:

You can use the parse function from dateutil:

>>> from dateutil.parser import parse
>>> d = parse('2009/05/13 19:19:30 -0400')
>>> d
datetime.datetime(2009, 5, 13, 19, 19, 30, tzinfo=tzoffset(None, -14400))

This way you obtain a datetime object you can then use.

As answered, dateutil2.0 is written for Python 3.0 and does not work with Python 2.x. For Python 2.x dateutil1.5 needs to be used.

Solution 2:

%z is supported in Python 3.2+:

>>> from datetime import datetime
>>> datetime.strptime('2009/05/13 19:19:30 -0400', '%Y/%m/%d %H:%M:%S %z')
datetime.datetime(2009, 5, 13, 19, 19, 30,
                  tzinfo=datetime.timezone(datetime.timedelta(-1, 72000)))

On earlier versions:

from datetime import datetime

date_str = '2009/05/13 19:19:30 -0400'
naive_date_str, _, offset_str = date_str.rpartition(' ')
naive_dt = datetime.strptime(naive_date_str, '%Y/%m/%d %H:%M:%S')
offset = int(offset_str[-4:-2])*60 + int(offset_str[-2:])
if offset_str[0] == "-":
   offset = -offset
dt = naive_dt.replace(tzinfo=FixedOffset(offset))
print(repr(dt))
# -> datetime.datetime(2009, 5, 13, 19, 19, 30, tzinfo=FixedOffset(-240))
print(dt)
# -> 2009-05-13 19:19:30-04:00

where FixedOffset is a class based on the code example from the docs:

from datetime import timedelta, tzinfo

class FixedOffset(tzinfo):
    """Fixed offset in minutes: `time = utc_time + utc_offset`."""
    def __init__(self, offset):
        self.__offset = timedelta(minutes=offset)
        hours, minutes = divmod(offset, 60)
        #NOTE: the last part is to remind about deprecated POSIX GMT+h timezones
        #  that have the opposite sign in the name;
        #  the corresponding numeric value is not used e.g., no minutes
        self.__name = '<%+03d%02d>%+d' % (hours, minutes, -hours)
    def utcoffset(self, dt=None):
        return self.__offset
    def tzname(self, dt=None):
        return self.__name
    def dst(self, dt=None):
        return timedelta(0)
    def __repr__(self):
        return 'FixedOffset(%d)' % (self.utcoffset().total_seconds() / 60)

Solution 3:

Here is a fix of the "%z" issue for Python 2.7 and earlier

Instead of using:

datetime.strptime(t,'%Y-%m-%dT%H:%M %z')

Use the timedelta to account for the timezone, like this:

from datetime import datetime,timedelta
def dt_parse(t):
    ret = datetime.strptime(t[0:16],'%Y-%m-%dT%H:%M')
    if t[18]=='+':
        ret-=timedelta(hours=int(t[19:22]),minutes=int(t[23:]))
    elif t[18]=='-':
        ret+=timedelta(hours=int(t[19:22]),minutes=int(t[23:]))
    return ret

Note that the dates would be converted to GMT, which would allow doing date arithmetic without worrying about time zones.