How to convert unix timestamp to date in Spark
I have a DataFrame with a column of Unix timestamps (e.g. 1435655706000), and I want to convert it to a date with the format 'yyyy-MM-dd'. I tried nscala-time, but it doesn't work.
val time_col = sqlc.sql("select ts from mr").map(_(0).toString.toDateTime)
time_col.collect().foreach(println)
and I got this error: java.lang.IllegalArgumentException: Invalid format: "1435655706000" is malformed at "6000"
Here it is using Scala DataFrame functions: from_unixtime and to_date
import org.apache.spark.sql.functions.{from_unixtime, to_date}
// NOTE: from_unixtime expects seconds, so divide by 1000 if ts is in milliseconds
// e.g. 1446846655609 -> 2015-11-06 21:50:55 -> 2015-11-06
mr.select(to_date(from_unixtime($"ts" / 1000)))
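To see what the division by 1000 accounts for, here is a minimal sketch in plain Scala, without Spark, that reads the value from the question as epoch milliseconds. The object and helper names (`EpochDemo`, `millisToDate`) are made up for illustration, and UTC is an assumption; Spark itself would use its own time zone setting, so results near midnight can differ.

```scala
import java.time.{Instant, LocalDate, ZoneOffset}

object EpochDemo {
  // Hypothetical helper (not part of Spark): interpret epoch milliseconds in UTC
  // and keep only the calendar date.
  def millisToDate(ms: Long): LocalDate =
    Instant.ofEpochMilli(ms).atZone(ZoneOffset.UTC).toLocalDate

  def main(args: Array[String]): Unit = {
    // The timestamp from the question, read as milliseconds
    println(millisToDate(1435655706000L)) // 2015-06-30
  }
}
```

Passing the same number to a function that expects seconds, without dividing first, would land tens of thousands of years in the future, which is why the `/ 1000` matters.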
Since Spark 1.5, there is a built-in function for doing that.
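// ts is in milliseconds, so convert to seconds first; 'yyyy' is the calendar year ('YYYY' is the week-based year)
val df = sqlContext.sql("select from_unixtime(cast(ts / 1000 as bigint), 'yyyy-MM-dd') as `ts` from mr")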
Please check Spark 1.5.2 API Doc for more info.
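A note on the pattern letters: in Java-style date patterns, lowercase 'yyyy' is the calendar year, while uppercase 'YYYY' is the week-based year, and the two can disagree around New Year. The sketch below shows the difference with plain java.time rather than Spark; `PatternDemo` is a made-up name, and the locale is pinned to `Locale.US` as an assumption so the week rules are fixed.

```scala
import java.time.LocalDate
import java.time.format.DateTimeFormatter
import java.util.Locale

object PatternDemo {
  // Locale.US is an assumption: week-based years depend on the locale's week rules.
  val calendarYear: DateTimeFormatter = DateTimeFormatter.ofPattern("yyyy-MM-dd", Locale.US)
  val weekYear: DateTimeFormatter = DateTimeFormatter.ofPattern("YYYY-MM-dd", Locale.US)

  def main(args: Array[String]): Unit = {
    // A Monday whose week already contains 1 January 2015
    val d = LocalDate.of(2014, 12, 29)
    println(calendarYear.format(d)) // 2014-12-29
    println(weekYear.format(d))     // 2015-12-29
  }
}
```

For a plain calendar date, 'yyyy-MM-dd' is usually the pattern you want.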