update query in Spark SQL
Spark SQL doesn't support UPDATE
statements yet.
Hive has started supporting UPDATE
since hive version 0.14. But even with Hive, it supports updates/deletes only on those tables that support transactions, it is mentioned in the hive documentation.
See the answers in databricks forums confirming that UPDATES/DELETES are not supported in Spark SQL as it doesn't support transactions. If we think, supporting random updates is very complex with most of the storage formats in big data. It requires scanning huge files, updating specific records and rewriting potentially TBs of data. It is not normal SQL.
Now it possible, with Databricks Delta Lake
Spark SQL now supports update, delete and such data modification operations if the underlying table is in delta format.
Check this out: https://docs.delta.io/0.4.0/delta-update.html#update-a-table