Postgres SSL SYSCALL error: EOF detected with python and psycopg
Solution 1:
The error: psycopg2.operationalerror: SSL SYSCALL error: EOF detected
The setup: Airflow + Redshift + psycopg2
When: Queries take a long time to execute (more than 300 seconds).
A socket timeout occurs in this instance. What solves this specific variant of the error is adding keepalive arguments to the connection string.
keepalive_kwargs = {
"keepalives": 1,
"keepalives_idle": 30,
"keepalives_interval": 5,
"keepalives_count": 5,
}
conection = psycopg2.connect(connection_string, **keepalive_kwargs)
Redshift requires a keepalives_idle
of less than 300. A value of 30 worked for me, your mileage may vary. It is also possible that the keepalives_idle
argument is the only one you need to set - but ensure keepalives
is set to 1.
Link to docs on postgres keepalives.
Link to airflow doc advising on 300 timeout.
Solution 2:
I ran into this problem when running a slow query in a Droplet on a Digital Ocean instance. All other SQL would run fine and it worked on my laptop. After scaling up to a 1 GB RAM instance instead of 512 MB it works fine so it seems that this error could occur if the process is running out of memory.
Solution 3:
This issue occurred for me when I had some rogue queries running causing tables to be locked indefinitely. I was able to see the queries by running:
SELECT * from STV_RECENTS where status='Running' order by starttime desc;
then kill them with:
SELECT pg_terminate_backend(<pid>);
Solution 4:
Very similar answer to what @FoxMulder900 did, except I could not get his first select to work. This works, though:
WITH long_running AS (
SELECT pid, now() - pg_stat_activity.query_start AS duration, query, state
FROM pg_stat_activity
WHERE (now() - pg_stat_activity.query_start) > interval '1 minutes'
and state = 'active'
)
SELECT * from long_running;
If you want to kill the processes from long_running
just comment out the last line and insert SELECT pg_cancel_backend(long_running.pid) from long_running ;