Python Pandas : pandas.to_datetime() is switching day & month when day is less than 13
Solution 1:
You can use the dayfirst parameter of pd.to_datetime, which tells the parser to read ambiguous dates as day/month rather than month/day.
pd.to_datetime(df.tmp, dayfirst=True)
Output:
0 2015-06-11 00:56:55
1 2015-06-11 04:16:38
2 2015-06-12 16:13:30
3 2015-06-12 21:24:03
4 2015-06-13 02:31:44
5 2015-06-13 02:37:49
Name: tmp, dtype: datetime64[ns]
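To make the effect of dayfirst concrete, here is a minimal sketch. The original contents of df.tmp are not shown, so the sample strings below (day/month/year with a time) are an assumption for illustration only. The sketch also shows the format argument of pd.to_datetime, which may be a safer choice when the layout of the strings is known, since it removes the ambiguity entirely.

```python
import pandas as pd

# Hypothetical sample data: the real format of df.tmp is assumed to be
# day/month/year followed by a time.
df = pd.DataFrame(
    {"tmp": ["11/06/2015 00:56:55", "12/06/2015 16:13:30", "13/06/2015 02:31:44"]}
)

# dayfirst=True asks the parser to read ambiguous dates as day/month.
parsed_dayfirst = pd.to_datetime(df["tmp"], dayfirst=True)

# An explicit format leaves nothing to guess and raises on rows that do not match.
parsed_format = pd.to_datetime(df["tmp"], format="%d/%m/%Y %H:%M:%S")

print(parsed_dayfirst)
```

With this sample data both calls should give the same result, with June as the month on every row.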
Solution 2:
I solved my problem, though in a memory-hungry way. I first split the tmp column into separate date and time columns, then split the date column again into day, month and year. That let me find the rows where the day is less than 13 and swap the day with the corresponding month.
df['tmp'] = pd.to_datetime(df['tmp'], unit='ns')
df['tmp'] = df['tmp'].apply(lambda x: x.replace(microsecond=0))
df['date'] = [d.date() for d in df['tmp']]
df['time'] = [d.time() for d in df['tmp']]
df[['year','month','day']] = df['date'].apply(lambda x: pd.Series(x.strftime("%Y-%m-%d").split("-")))
df['day'] = pd.to_numeric(df['day'], errors='coerce')
df['month'] = pd.to_numeric(df['month'], errors='coerce')
df['year'] = pd.to_numeric(df['year'], errors='coerce')
#Loop to look for days less than 13 and then swap the day and month
for index, d in enumerate(df['day']):
    if d < 13:
        df.loc[index, 'day'], df.loc[index, 'month'] = df.loc[index, 'month'], df.loc[index, 'day']
# convert series to string type in order to merge them
df['day'] = df['day'].astype(str)
df['month'] = df['month'].astype(str)
df['year'] = df['year'].astype(str)
df['date'] = pd.to_datetime(df[['year', 'month', 'day']])
df['date'] = df['date'].astype(str)
df['time'] = df['time'].astype(str)
# merge time and date and place result in our column
df['tmp'] = pd.to_datetime(df['date'] + ' ' + df['time'])
# drop the added columns
df.drop(columns=['date', 'year', 'month', 'day', 'time'], inplace=True)
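The same swap can probably be done without the row-by-row loop and the helper columns. The following is only a sketch under assumptions: the sample timestamps are hypothetical, and it uses the same rule as the loop above (swap day and month wherever the parsed day is less than 13), so it is only appropriate when every such row really was parsed month-first.

```python
import pandas as pd

# Hypothetical column that was already parsed month-first, so the rows with a
# day below 13 have day and month swapped.
df = pd.DataFrame(
    {"tmp": pd.to_datetime(["2015-11-06 00:56:55", "2015-12-06 16:13:30", "2015-06-13 02:31:44"])}
)

ts = df["tmp"]
swap = ts.dt.day < 13  # same condition as the loop above

# Build year/month/day columns, exchanging day and month on the flagged rows.
parts = pd.DataFrame(
    {
        "year": ts.dt.year,
        "month": ts.dt.month.where(~swap, ts.dt.day),
        "day": ts.dt.day.where(~swap, ts.dt.month),
    }
)

# Reassemble the date and add back the time of day.
df["tmp"] = pd.to_datetime(parts) + (ts - ts.dt.normalize())

print(df["tmp"])
```

This avoids the string conversions and the temporary columns, which should reduce the memory overhead mentioned above. Parsing correctly in the first place (Solution 1) is still the simpler fix when the raw strings are available.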