Convert timedelta64[ns] column to seconds in Python Pandas DataFrame
Solution 1:
This works in Pandas 0.14 and later:
In [132]: df[:5]['duration'] / np.timedelta64(1, 's')
Out[132]:
0 1232
1 1390
2 1495
3 797
4 1132
Name: duration, dtype: float64
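For anyone who wants to try this without the original data, here is a minimal self-contained sketch. The DataFrame contents are made up for illustration (the real df is not shown in this answer); only the division by np.timedelta64 comes from the snippet above.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the original df is not shown in the answer.
df = pd.DataFrame({"duration": pd.to_timedelta([1232, 1390, 1495], unit="s")})

# Dividing by a one-second timedelta yields a float64 Series of seconds.
seconds = df["duration"] / np.timedelta64(1, "s")

# The same pattern works for other units, for example minutes.
minutes = df["duration"] / np.timedelta64(1, "m")

print(seconds)
print(minutes)
```

Dividing by a timedelta of the target unit keeps the conversion independent of how the values are stored internally.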
Here is a workaround for older versions of Pandas/NumPy:
In [131]: df[:5]['duration'].values.view('<i8') // 10**9
Out[131]: array([1232, 1390, 1495, 797, 1132], dtype=int64)
timedelta64 and datetime64 data are stored internally as 8-byte ints (dtype '<i8'). So the above views the timedelta64s as 8-byte ints and then does integer division to convert nanoseconds to seconds.
Note that you need NumPy version 1.7 or newer to work with datetime64/timedelta64s.
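The view trick can be sketched on made-up data as follows. The sample values and variable names are assumptions for illustration, and the explicit cast to timedelta64[ns] is an extra safety step not in the original snippet; it is there so that dividing by 10**9 is valid even if the data arrives with a different resolution.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; not taken from the original question.
durations = pd.Series(pd.to_timedelta([1232, 1390, 1495], unit="s"))

# Make sure the values are in nanoseconds, then reinterpret the same
# 8 bytes per element as int64 (no data is copied by the view itself).
raw_ns = durations.to_numpy().astype("timedelta64[ns]").view("int64")

# Floor division rounds down to whole seconds and keeps an integer dtype.
whole_seconds = raw_ns // 10**9

print(raw_ns)
print(whole_seconds)
```

Note that this works on the raw integers, so any sub-second part is discarded rather than kept as a fraction.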
Solution 2:
Use the Series dt accessor to reach the methods and attributes of a datetime-like (here, timedelta) Series. Its total_seconds() method returns each duration as a float number of seconds.
>>> s
0 -1 days +23:45:14.304000
1 -1 days +23:46:57.132000
2 -1 days +23:49:25.913000
3 -1 days +23:59:48.913000
4 00:00:00.820000
dtype: timedelta64[ns]
>>>
>>> s.dt.total_seconds()
0 -885.696
1 -782.868
2 -634.087
3 -11.087
4 0.820
dtype: float64
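A runnable version of the session above might look like this. The Series is rebuilt from millisecond integers chosen to match the values shown; how the original s was created is not stated in this answer, so that construction is an assumption.

```python
import pandas as pd

# Rebuild a Series like the one above from milliseconds (assumed construction).
s = pd.Series(pd.to_timedelta([-885696, -782868, -634087, -11087, 820], unit="ms"))

# total_seconds() keeps the sign and the fractional part of each duration.
total = s.dt.total_seconds()

print(s)
print(total)
```

Unlike the integer view workaround, this keeps fractional seconds and handles negative durations without extra work.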
Pandas provides similar Series accessors for other data types: str for strings, cat for categoricals, and sparse for sparse data.