How to create a Spark DataFrame from a Map[String, Any] in Scala?
import org.apache.spark.sql.Row
import org.apache.spark.sql.types._

val schema = StructType(List(
  StructField("col1", IntegerType, nullable = true),
  StructField("col2", DoubleType, nullable = true)
))
val empty_df = spark.createDataFrame(spark.sparkContext.emptyRDD[Row], schema)
val tempFillMap: Map[String, Any] = Map("col1" -> 3, "col2" -> 4.0)
How can I create or update the DataFrame from tempFillMap?
Solution 1:
If the map values have simple types (subtypes of AnyVal), the schema can be derived from the runtime types of the values, and the DataFrame created with that schema:
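import org.apache.spark.sql.Row
import org.apache.spark.sql.types._
import scala.collection.JavaConverters._

// Map a runtime value to the corresponding Spark SQL type
def getSparkType(value: Any): DataType = value match {
  case _: Int => IntegerType
  case _: Double => DoubleType
  // TODO include other types here
  case _ => throw new IllegalArgumentException(s"Value $value type cannot be converted to Spark type!")
}

// One StructField per map entry, in the map's iteration order
val schema = tempFillMap.map(kv => StructField(kv._1, getSparkType(kv._2)))
// tempFillMap.values iterates in the same order as the entries above
val df = spark.createDataFrame(List(Row(tempFillMap.values.toSeq: _*)).asJava, StructType(schema.toArray))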
Result:
+----+----+
|col1|col2|
+----+----+
|3   |4.0 |
+----+----+
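The "TODO include other types here" branch of getSparkType could be filled in along the following lines. This is only a sketch: the name getSparkTypeExtended is hypothetical, and the set of supported types is an assumption about what the map might contain, not an exhaustive mapping.

```scala
import org.apache.spark.sql.types._

// Hypothetical extension of getSparkType; the list of cases is an assumption
// about which value types the map may hold.
def getSparkTypeExtended(value: Any): DataType = value match {
  case _: Boolean            => BooleanType
  case _: Byte               => ByteType
  case _: Short              => ShortType
  case _: Int                => IntegerType
  case _: Long               => LongType
  case _: Float              => FloatType
  case _: Double             => DoubleType
  case _: String             => StringType
  case _: java.sql.Date      => DateType
  case _: java.sql.Timestamp => TimestampType
  // null and any other type end up here, because a type cannot be inferred
  case other =>
    throw new IllegalArgumentException(s"Value $other type cannot be converted to Spark type!")
}
```

Note that a null value carries no type information, so a map that may contain nulls is better served by a predefined schema.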
If the schema is predefined and the map contains a value for every field, the map values can be arranged in schema order and added as a Row:
val valuesInCorrectOrder = schema.fieldNames.map(name => tempFillMap(name))
val df = spark.createDataFrame(List(Row(valuesInCorrectOrder: _*)).asJava, schema)
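For the "update" part of the question, one possible approach is to build a one-row DataFrame from the map and union it onto the existing DataFrame. The sketch below assumes a local SparkSession and that missing keys should become null (which requires nullable fields); the helper name rowFromMap is hypothetical.

```scala
import org.apache.spark.sql.{Row, SparkSession}
import org.apache.spark.sql.types._
import scala.collection.JavaConverters._

// Assumption: a local session, for illustration only
val spark = SparkSession.builder().master("local[1]").appName("map-to-df").getOrCreate()

val schema = StructType(List(
  StructField("col1", IntegerType, nullable = true),
  StructField("col2", DoubleType, nullable = true)
))

// Hypothetical helper: order the map values by schema field name,
// substituting null for keys that are absent from the map.
def rowFromMap(values: Map[String, Any], schema: StructType): Row =
  Row(schema.fieldNames.map(name => values.getOrElse(name, null)): _*)

val existing = spark.createDataFrame(spark.sparkContext.emptyRDD[Row], schema)

// "col2" is missing here, so it is stored as null
val partial: Map[String, Any] = Map("col1" -> 3)
val newRow = spark.createDataFrame(List(rowFromMap(partial, schema)).asJava, schema)

// DataFrames are immutable, so "updating" means producing a new DataFrame
val updated = existing.union(newRow)
updated.show()
```

Because union matches columns by position rather than by name, both DataFrames should be built from the same schema, as they are here.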