How to assign an empty value to a column of complex type?

I am trying to convert a dataframe into a dataset corresponding to the model EmailToSend.

My models object:

object Models {
    case class EmailToSend(
        a1: String,
        promoCodeTemplate: Option[PromoCodeTemplate]
    )

    case class PromoCodeTemplate(
        b1: String
    )
}

My code:

val myDataset: Dataset[Models.EmailToSend] = myDf.as[Models.EmailToSend]

myDf contains all columns required by EmailToSend, except promoCodeTemplate. As a consequence, this code fails at runtime:

cannot resolve '`promoCodeTemplate`' given input columns: [a1];

promoCodeTemplate is missing from that dataframe, which is what I expect. It will be filled later, but for now it has to be empty: there is no promo code template, this is normal.

The problem is that I cannot make this work without filling it with a promo code template. I tried adding an empty value with withColumn, but none of the values I tried worked.

val myDataset: Dataset[Models.EmailToSend] = myDf
    // this is one of the many values I tried
    .withColumn("promoCodeTemplate", lit(null.asInstanceOf[Models.PromoCodeTemplate]).cast(Models.PromoCodeTemplate))
    .as[Models.EmailToSend]

How do I assign an empty value to the column promoCodeTemplate?


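Add a struct column whose Spark type matches the case class PromoCodeTemplate. Note that this sets b1 to an empty string, so the field decodes as Some(PromoCodeTemplate("")) rather than None.

// assumes: import org.apache.spark.sql.functions.{lit, struct} and import Models._
val myDataset = myDf.withColumn("promoCodeTemplate", struct(lit("").as("b1"))).as[EmailToSend]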

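Alternatively, build the same column from a case class instance with typedLit (available since Spark 2.2):

// assumes: import org.apache.spark.sql.functions.typedLit
myDf.withColumn("promoCodeTemplate", typedLit(PromoCodeTemplate(""))).as[EmailToSend]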

To add a null value instead, so that the field decodes as None:

myDf.withColumn("promoCodeTemplate", typedLit(null.asInstanceOf[PromoCodeTemplate])).as[EmailToSend]
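
Another way to get a null column of the right type is to cast a null literal to the struct type derived from the case class. The following is only a minimal sketch: the object name NullStructExample, the local SparkSession, and the single sample row standing in for myDf are assumptions made for illustration, not part of the original code.

```scala
import org.apache.spark.sql.{Dataset, Encoders, SparkSession}
import org.apache.spark.sql.functions.lit

object Models {
  case class PromoCodeTemplate(b1: String)
  case class EmailToSend(a1: String, promoCodeTemplate: Option[PromoCodeTemplate])
}

// Hypothetical driver object, used only to make the sketch self-contained
object NullStructExample {
  def main(args: Array[String]): Unit = {
    // Local session for illustration only
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("null-struct-example")
      .getOrCreate()
    import spark.implicits._

    // Stand-in for myDf: it has the a1 column but no promoCodeTemplate
    val myDf = Seq("hello").toDF("a1")

    // Derive the struct type from the case class instead of writing it by hand
    val promoType = Encoders.product[Models.PromoCodeTemplate].schema

    // A null literal cast to that struct type gives a nullable struct column
    val myDataset: Dataset[Models.EmailToSend] = myDf
      .withColumn("promoCodeTemplate", lit(null).cast(promoType))
      .as[Models.EmailToSend]

    myDataset.printSchema()
    myDataset.show()

    spark.stop()
  }
}
```

Deriving the type with Encoders.product keeps the column in sync with PromoCodeTemplate if fields are added later. Since the column holds null rather than a struct with empty fields, promoCodeTemplate should come back as None when rows are collected.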