Why does Spark write null values into a Delta Lake table?
Solution 1:
After refactoring my Java code and adding a debug code segment, I was able to identify the error.
See the refactoring:
StreamingQuery query = dataStreamReader.load()
        .as(Encoders.STRING())
        .map((MapFunction<String, StringArray>) x -> new StringArray(x),
                stringArrayEncoder)
        .map((MapFunction<StringArray, RDMessage>) r -> new RDMessage(r),
                messageEncoder)
        .map((MapFunction<RDMessage, RDMeasurement>) e -> e.getRdMeasurement(),
                measurementEncoder)
        /*
        // Debug segment: print every measurement with a non-null data source,
        // then rebuild it from its string form so the stream contents stay
        // unchanged.
        .map((MapFunction<RDMeasurement, String>) e -> {
            if (e.getDataSourceName() != null) {
                System.out.println("•••> " + e);
            }
            return e.toString();
        }, Encoders.STRING())
        .map((MapFunction<String, RDMeasurement>) s -> new RDMeasurement(s),
                measurementEncoder)
        */
        .writeStream()
        .outputMode("append")
        .format("console")
        .start();
query.awaitTermination();
The commented-out debug segment above, which prints each measurement while passing it through unchanged, is what allowed me to identify the problem.
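The debug trick here is a general pattern: splice a map step into a pipeline that logs each element and returns it unchanged. The following is a minimal plain-Java sketch of that pass-through "tap" idea, without Spark; the `Measurement` record and its `dataSourceName` field are hypothetical stand-ins for `RDMeasurement` and `getDataSourceName()`.

```java
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;

public class DebugTap {
    // Hypothetical stand-in for the RDMeasurement type.
    record Measurement(String dataSourceName, double value) {}

    // Returns a map function that logs a description of each element
    // (when non-null) and passes the element through unchanged.
    static <T> Function<T, T> tap(Function<T, String> describe) {
        return e -> {
            String s = describe.apply(e);
            if (s != null) {
                System.out.println("•••> " + s);
            }
            return e; // pass through unchanged
        };
    }

    public static void main(String[] args) {
        List<Measurement> in = List.of(
                new Measurement("sensor-a", 1.5),
                new Measurement(null, 2.0));

        List<Measurement> out = in.stream()
                .map(tap(Measurement::dataSourceName)) // logs only non-null sources
                .collect(Collectors.toList());

        // Every element still flows through, logged or not.
        System.out.println(out.size());
    }
}
```

Because the tap returns its input, it can be inserted or commented out (as in the answer above) without altering what the downstream stages receive.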