Python Pandas: pandas.to_datetime() is switching day and month when the day is less than 13

Solution 1:

You can use the dayfirst parameter of pd.to_datetime, so that ambiguous dates are parsed with the day before the month.

pd.to_datetime(df.tmp, dayfirst=True)

Output:

0   2015-06-11 00:56:55
1   2015-06-11 04:16:38
2   2015-06-12 16:13:30
3   2015-06-12 21:24:03
4   2015-06-13 02:31:44
5   2015-06-13 02:37:49
Name: tmp, dtype: datetime64[ns]
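To make the effect of dayfirst concrete, here is a minimal sketch. The raw strings in the question are not shown, so the sample values and their DD/MM/YYYY layout below are assumptions chosen to match the output above. The explicit format string is a stricter alternative when the layout is known in advance.

```python
import pandas as pd

# Hypothetical sample data: the question's raw strings are not shown,
# so a DD/MM/YYYY HH:MM:SS layout is assumed here.
df = pd.DataFrame({
    "tmp": [
        "11/06/2015 00:56:55",
        "12/06/2015 16:13:30",
        "13/06/2015 02:31:44",
    ]
})

# dayfirst=True tells the parser to read the first number as the day
# whenever a date is ambiguous.
parsed_dayfirst = pd.to_datetime(df["tmp"], dayfirst=True)

# An explicit format removes the ambiguity entirely and fails loudly
# on strings that do not match.
parsed_explicit = pd.to_datetime(df["tmp"], format="%d/%m/%Y %H:%M:%S")

print(parsed_dayfirst)
```

Note that dayfirst is a parsing preference rather than a strict rule, so the explicit format is the safer choice when every row shares one layout.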

Solution 2:

Well, I solved my problem, but in a memory-consuming way. I first split my tmp column into date and time columns, then split the date column into day, month and year. That way I could look for the days that are less than 13 and swap them with the corresponding month.

df['tmp'] = pd.to_datetime(df['tmp'], unit='ns')
df['tmp'] = df['tmp'].apply(lambda x: x.replace(microsecond=0))
df['date'] = [d.date() for d in df['tmp']]
df['time'] = [d.time() for d in df['tmp']]
df[['year','month','day']] = df['date'].apply(lambda x: pd.Series(x.strftime("%Y-%m-%d").split("-")))

df['day'] = pd.to_numeric(df['day'], errors='coerce')
df['month'] = pd.to_numeric(df['month'], errors='coerce')
df['year'] = pd.to_numeric(df['year'], errors='coerce')


# Loop over the rows and swap day and month wherever the day is less than 13
# (enumerate gives positions, so this assumes the default 0..n-1 index)
for index, d in enumerate(df['day']):
    if d < 13:
        df.loc[index, 'day'], df.loc[index, 'month'] = df.loc[index, 'month'], df.loc[index, 'day']

# convert the series to string type in order to merge them

df['day'] = df['day'].astype(str)
df['month'] = df['month'].astype(str)
df['year'] = df['year'].astype(str)
df['date'] = pd.to_datetime(df[['year', 'month', 'day']])
df['date'] = df['date'].astype(str)
df['time'] = df['time'].astype(str)

# merge time and date and place result in our column

df['tmp'] =pd.to_datetime(df['date']+ ' '+df['time'])

# drop the added columns

df.drop(columns=['date', 'year', 'month', 'day', 'time'], inplace=True)
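The same swap can probably be done without the row loop or the helper columns. The following is only a sketch of a vectorized variant, not the answer's original method: it assumes tmp is already a datetime64 column in which some rows were parsed month-first, and the sample timestamps are made up for illustration.

```python
import pandas as pd

# Hypothetical input: the first two rows were parsed month-first
# (11 June became 6 November), the last row was parsed correctly.
df = pd.DataFrame({
    "tmp": pd.to_datetime([
        "2015-11-06 00:56:55",
        "2015-12-06 16:13:30",
        "2015-06-13 02:31:44",
    ])
})

ts = df["tmp"]
swap = ts.dt.day < 13  # same condition as the loop: day less than 13

# Rebuild the date, exchanging day and month only on the flagged rows.
parts = pd.DataFrame({
    "year": ts.dt.year,
    "month": ts.dt.month.where(~swap, ts.dt.day),
    "day": ts.dt.day.where(~swap, ts.dt.month),
})

# Add the time of day back as a timedelta from midnight.
time_of_day = ts - ts.dt.normalize()
df["tmp"] = pd.to_datetime(parts) + time_of_day

print(df)
```

Like the loop, this swaps every row whose day is less than 13, so it is only appropriate when all such rows really were parsed the wrong way round.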