Converting a time stored as an int in HHMM format into an hour field using pandas
I have a dataframe that looks like the following image.
This is what df.head().to_dict() produces:
The Time Occurred field holds a time in HHMM format, but its datatype is int. I'd like to create a new field "Hour" from the "Time Occurred" field that shows only the hour. For instance, where the value is 450, I'd have 4 in the Hour field. I tried the following code:
s = df['Time Occurred'].astype(str)
df['Hour'] = pd.to_timedelta(s.str[:2] + ':' + s.str[2:], unit='h')
But I get the following error: ValueError: unit must not be specified if the input contains a str
I found several posts on Stack Overflow about extracting the time from a datetime, but that isn't the problem I'm trying to solve.
One way, using pandas.Series.str.zfill:
s = pd.Series([1, 450, 2320, 545, 350, 2100])
pd.to_datetime(s.astype(str).str.zfill(4), format="%H%M")
Output:
0 1900-01-01 00:01:00
1 1900-01-01 04:50:00
2 1900-01-01 23:20:00
3 1900-01-01 05:45:00
4 1900-01-01 03:50:00
5 1900-01-01 21:00:00
Explanation:
- str.zfill(n) pads zeros on the left of its input up to width n, so with n=4 each of your ints becomes a 4-digit string:
0    0001
1    0450
2    2320
3    0545
4    0350
5    2100
- pd.to_datetime then parses each string according to format, i.e. HHMM (%H%M).
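The parsed result above is a full datetime, while the question asks for just the hour. A minimal sketch of that last step follows, assuming a small made-up frame (only the column name "Time Occurred" comes from the question; the sample values and the "Hour_int" column are illustrative). It also shows an integer-division alternative, which is not part of the answer above and does no validation of the input.

```python
import pandas as pd

# Hypothetical sample data; only the column name is taken from the question.
df = pd.DataFrame({"Time Occurred": [1, 450, 2320, 545, 350, 2100]})

# Option 1: zero-pad to 4 digits, parse as HHMM, then take the hour component.
parsed = pd.to_datetime(df["Time Occurred"].astype(str).str.zfill(4), format="%H%M")
df["Hour"] = parsed.dt.hour

# Option 2 (assumption: every value is a valid HHMM int): integer division
# by 100 drops the two minute digits and leaves the hour.
df["Hour_int"] = df["Time Occurred"] // 100

print(df)
```

Both columns come out as 0, 4, 23, 5, 3, 21 for this sample. Option 1 raises on values that are not valid times (for example 2575), whereas Option 2 would silently return 25, so the parsing route is the safer choice when the data may be dirty.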