Large table issue in Snowflake

2 GB is not that large, and it very much should not be taking 14 minutes on an X-SMALL warehouse.

First rule of Snowflake: don't SELECT * FROM x, for two reasons:

  • The query compilation has to wait for the metadata of all partitions to be loaded before the plan can start being built, as later partitions might have more data than the first partitions. Thus the output shape cannot be planned until everything is known.
  • Second reason, when you "select all columns", all columns are loaded from disk, and if your data is unstructured JSON it has to rebuild all of that data, which is "relatively expensive". You should name the columns you want, and only the columns you want.
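A minimal sketch of the difference, using a made-up table big_table with columns id, status and amount (names are purely for illustration):

    -- loads metadata and column data for every column in every partition
    SELECT * FROM big_table;

    -- reads only the named columns from storage
    SELECT id, status, amount
    FROM big_table;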

If you want to join to another table to do some filtering, select only the columns needed for the filter and the join, get the set of keys you want, and then re-join those results to the base table (sometimes as a second query) so pruning can happen, as in the sketch below.
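Something along these lines, assuming hypothetical tables big_table (keyed by id) and filter_table, and a made-up filter on region:

    -- step 1: use only the columns needed for the filter and the join
    -- to get the set of keys you actually want
    CREATE TEMPORARY TABLE wanted_keys AS
    SELECT f.id
    FROM filter_table AS f
    WHERE f.region = 'EU';

    -- step 2: re-join those keys to the base table (here as a second query)
    -- so partition pruning can happen on big_table via the join keys
    SELECT b.id, b.status, b.amount
    FROM big_table AS b
    JOIN wanted_keys AS k
      ON k.id = b.id;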

sigh, I have just looked at your stats a little harder... 25K columns... sigh. This is not a database, this is something very painful.

As a strong opinion, you cannot have a row of data for which it makes sense to have 25K related and meaningful columns. You should have a table with a primary key, and something like 25K rows of subtype data, one per attribute. Yes, it means you have to explode the data out via a PIVOT or the like, but it is more honest about the relations present in the data, and about how to process this volume of data.
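A rough sketch of that layout, with made-up names (entity_attribute as the tall table, and three example attributes): the wide 25K-column row becomes one row per attribute, and you PIVOT only the attributes a given query actually needs:

    -- tall layout: one row per (entity, attribute) pair instead of 25K columns
    -- entity_attribute(entity_id, attribute_name, attribute_value) is hypothetical
    WITH wanted AS (
        SELECT entity_id, attribute_name, attribute_value
        FROM entity_attribute
        WHERE attribute_name IN ('height', 'weight', 'colour')
    )
    SELECT *
    FROM wanted
        PIVOT (MAX(attribute_value) FOR attribute_name IN ('height', 'weight', 'colour'))
            AS p (entity_id, height, weight, colour);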