Lock Issue in Snowflake while inserting data from pyspark dataframe
I am trying to insert a pyspark dataframe to a snowflake table with the help of following code:
sfOptions = {
"sfURL" : "XXXXXXXXXXXX",
"sfAccount" : "XXXXXXXXXXXX",
"sfUser" : "XXXXXXXXXXXX",
"sfPassword" : "XXXXXXXXXXXX",
"sfDatabase" : "XXXXXXXXXXXX",
"sfSchema" : "XXXXXXXXXXXX",
"sfWarehouse" : "XXXXXXXXXXXX",
"sfRole" : "XXXXXXXXXXXX",
"column_mapping" : "name",
"column_mismatch_behavior":"ignore"
}
df.write.format("snowflake") \
.options(**sfOptions) \
.option("dbtable", table).mode("append").save()
I am successfully able to insert the data to the table with the help of above code. But when i am trying to do the same operation in multi-threading I am getting the below error:
y4JJavaError: An error occurred while calling o146688.save. : net.snowflake.client.jdbc.SnowflakeSQLException: Statement '019fce57-0b02-036b-0000-467d0010e94a' has locked table 'ERROR_RECORDS' in transaction 1635015824820 and this lock has not yet been released. Your statement '019fce57-0b02-036b-0000-467d0010e9f2' was aborted because the number of waiters for this lock exceeds the 20 statements limit.
I am not able to hit the method concurrently please do let me know any suggestions.
There is a soft limit for the maximum number of waiters to acquire a lock on a resource (table/micro-partition). As the message indicates, you are hitting this limit (20).
So the question is, why are there 20+ sessions trying to modify the same table in a data warehouse environment? From your question, I assume that you try to increase the parallelism by multi-threading. IMO, you should use one single spark job, and let the Snowflake handle the parallelism when ingesting the data.
It's also possible to increase this limit if you contact Snowflake Support, but that will only make things worst because that will not increase parallelism, it will only enable more sessions to wait.
https://community.snowflake.com/s/article/Your-statement-was-aborted-because-the-number-of-waiters-for-this-lock-exceeds-the-20-statements-limit