How to assign an empty value to a column of complex type?
I am trying to convert a dataframe into a dataset corresponding to the model EmailToSend.
My models object:
object Models {
  case class EmailToSend(
    a1: String,
    promoCodeTemplate: Option[PromoCodeTemplate]
  )
  case class PromoCodeTemplate(
    b1: String
  )
}
My code:
val myDataset: Dataset[Models.EmailToSend] = myDf.as[Models.EmailToSend]
myDf contains all columns required by EmailToSend except promoCodeTemplate. As a consequence, this code fails at runtime:
cannot resolve '`promoCodeTemplate`' given input columns: [a1];
promoCodeTemplate is missing from that dataframe, which is what I expect. It will be filled in later, but for now it has to be empty: having no promo code template at this stage is normal.
The problem is that I cannot make this work without filling the column with an actual promo code template. I tried to add an empty value with withColumn, but none of the values I tried worked.
val myDataset: Dataset[Models.EmailToSend] = myDf
  // this is one of the many values I tried
  .withColumn("promoCodeTemplate", lit(null.asInstanceOf[Models.PromoCodeTemplate]).cast(Models.PromoCodeTemplate))
  .as[Models.EmailToSend]
How do I assign an empty value to the column promoCodeTemplate?
You need to add a struct column whose Spark type matches the case class PromoCodeTemplate. For example, a struct whose b1 field is an empty string:
val myDataset = myDf.withColumn("promoCodeTemplate", struct(lit("").as("b1"))).as[EmailToSend]
Alternatively, you can build the same value with typedLit:
myDf.withColumn("promoCodeTemplate", typedLit(PromoCodeTemplate(""))).as[EmailToSend]
To add a null value instead, which decodes to None for the Option field:
myDf.withColumn("promoCodeTemplate", typedLit(null.asInstanceOf[PromoCodeTemplate])).as[EmailToSend]
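Another way to get the null variant is to cast an untyped null literal to the struct type that Spark derives for PromoCodeTemplate. The following is only a minimal sketch, not the only way to do it. It assumes a local SparkSession and a one-row stand-in for myDf, and the object name NullStructExample is made up for illustration.

```scala
import org.apache.spark.sql.{Dataset, Encoders, SparkSession}
import org.apache.spark.sql.functions.lit

object Models {
  case class PromoCodeTemplate(b1: String)
  case class EmailToSend(a1: String, promoCodeTemplate: Option[PromoCodeTemplate])
}

// Hypothetical wrapper object, used only to make the sketch self-contained.
object NullStructExample {
  def main(args: Array[String]): Unit = {
    // Assumption: a local session, purely for illustration.
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("null-struct-sketch")
      .getOrCreate()
    import spark.implicits._

    // Stand-in for myDf: it has a1 but no promoCodeTemplate column.
    val myDf = Seq("hello").toDF("a1")

    // Struct type that Spark derives for the case class (a single string field b1).
    val promoType = Encoders.product[Models.PromoCodeTemplate].schema

    // Add a null column of that struct type, then convert to the typed dataset.
    val myDataset: Dataset[Models.EmailToSend] = myDf
      .withColumn("promoCodeTemplate", lit(null).cast(promoType))
      .as[Models.EmailToSend]

    myDataset.printSchema()

    // The null struct should come back as None on the Option field.
    assert(myDataset.collect().head.promoCodeTemplate.isEmpty)

    spark.stop()
  }
}
```

Deriving the struct type from the encoder keeps the column definition in sync with the case class, so adding a field to PromoCodeTemplate later should not require touching the withColumn call.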