How is it possible for this Python query to alternate between two different results?
I am executing the following code,
import psycopg2
import pandas as pd
@contextmanager
def get_psyopg2_connection(server='my_server', port=1234, read_only=True):
try:
conn = psycopg2.connect("<connection_string>")
conn.set_session(readonly=read_only, autocommit=True)
yield conn
except Exception as e:
raise Exception(e) from e
finally:
conn.close()
if __name__=="__main__":
with get_psyopg2_connection() as conn_dev:
df = pd.read_sql("""select last_value from public.lime_boxes_id_seq""", conn_dev)
print(df)
and I receive two different results; last_value = 6 (correct) and 33 (incorrect). When I run the same code several times, I might get a result sequence like 6, 6, 6, 33, 33, 6. It's unpredictable.
When I execute this query in DBeaver, I always get 6.
How could this be happening?
Edit:
This table is a sequence and no one is accessing this schema except for myself. Here is the content
last_value | log_cnt | is_called |
---|---|---|
6 | 27 | true |
There are 3 distinct possibilities that stand out to me:
-
There are multiple records in
public.lime_boxes_id_seq
, and since you haven't specified anORDER BY
clause, the database will not guarantee the order in which the records are returned. While this is definitely a low likelihood, it's still worth exploring what data the table being queried actually contains and to work from there. -
The database server in question (
my_server
) is actually some sort of load balancer for a cluster of Postgres nodes behind it, and there is one or more nodes that have fallen out of sync with the rest and is returning old/incorrect data. This would explain entirely the reason your DBeaver client gets the same result each time (it's probably not connecting/disconnecting/reconnecting between queries, potentially to different servers each time and likely maintaining the same connection between queries), but your code (which connects/disconnects/reconnects each execution) does not. To alleviate this, you can connect to a known-good node in the cluster (if your networking setup permits such an arrangement), or fix the replication issue that causes the staleness of the data. -
If the value in question was ever 33, it might be possible there is some sort of caching mechanism in the network between your code and the table in Postgres that is returning this erroneous data. If it was cached by this appliance/these appliances and hasn't been flushed since then, this could also be a reason why this data isn't returning what you believe it should, and a cache flush/reconfiguration would be necessary.