Incremental updates to a Transformed Table
I am trying to setup an ELT pipeline into Snowflake and it involves a transformation after loading.
This transformation will currently create or replace a Table using data queried from a source table in Snowflake after performing some manipulations of JSON data.
My question is, is this the proper way of doing it via create or replace Table everytime the transformation runs or is there a way to update the data in the transformed table incrementally?
Any advise would be greatly appreciated!
Thanks!
You can Insert into the load (soruce) table, and put into a stream, then you can know the rows, ranges of rows that need to be "reviewed" and then upsert into the output transform table.
That is is you doing something like "daily aggregates", thus if in "this batch you have data for the last 4 days, you then read the "last four days" of data from source (space a full read) and then aggregate and upsert via merge command. Thus with the model you can save reads/aggregate/write.
We have also used high water tables, to know last seen data, and/or lowest value in current batch.