flask_sqlalchemy `pool_pre_ping` only working sometimes

For testing, I amended the MySQL (RDS) parameters as follows:

wait_timeout = 40 (default was 28800)

max_allowed_packet = 1GB (the maximum - just to be sure the issue isn't caused by small packets)

net_read_timeout = 10

interactive_timeout unchanged
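
To confirm which values the server actually applies to a session, something like the sketch below can be run (it assumes the same SQLALCHEMY_DATABASE_URI_DEV environment variable the app uses):

import os
from sqlalchemy import create_engine, text

# Debugging sketch: print the session timeouts the server actually applies
engine = create_engine(os.getenv('SQLALCHEMY_DATABASE_URI_DEV'))
with engine.connect() as conn:
    rows = conn.execute(text(
        "SHOW SESSION VARIABLES WHERE Variable_name IN "
        "('wait_timeout', 'net_read_timeout', 'interactive_timeout')"
    ))
    for name, value in rows:
        print(name, '=', value)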

Then I tested my app without the pool_pre_ping option set (it defaults to False), kept the app inactive for 40 seconds, tried to log in, and I get:

Nov 14 20:05:20 ip-172-31-33-52 gunicorn[16962]: Traceback (most recent call last):
Nov 14 20:05:20 ip-172-31-33-52 gunicorn[16962]:   File "/var/www/api_server/venv/lib/python3.6/site-packages/sqlalchemy/engine/base.py", line 1193, in _execute_context
Nov 14 20:05:20 ip-172-31-33-52 gunicorn[16962]:     context)
Nov 14 20:05:20 ip-172-31-33-52 gunicorn[16962]:   File "/var/www/api_server/venv/lib/python3.6/site-packages/sqlalchemy/engine/default.py", line 507, in do_execute
Nov 14 20:05:20 ip-172-31-33-52 gunicorn[16962]:     cursor.execute(statement, parameters)
Nov 14 20:05:20 ip-172-31-33-52 gunicorn[16962]:   File "/var/www/api_server/venv/lib/python3.6/site-packages/MySQLdb/cursors.py", line 206, in execute
Nov 14 20:05:20 ip-172-31-33-52 gunicorn[16962]:     res = self._query(query)
Nov 14 20:05:20 ip-172-31-33-52 gunicorn[16962]:   File "/var/www/api_server/venv/lib/python3.6/site-packages/MySQLdb/cursors.py", line 312, in _query
Nov 14 20:05:20 ip-172-31-33-52 gunicorn[16962]:     db.query(q)
Nov 14 20:05:20 ip-172-31-33-52 gunicorn[16962]:   File "/var/www/api_server/venv/lib/python3.6/site-packages/MySQLdb/connections.py", line 224, in query
Nov 14 20:05:20 ip-172-31-33-52 gunicorn[16962]:     _mysql.connection.query(self, query)
Nov 14 20:05:20 ip-172-31-33-52 gunicorn[16962]: MySQLdb._exceptions.OperationalError: (2013, 'Lost connection to MySQL server during query')

I added pool_pre_ping like this (using flask_sqlalchemy version 2.4.1):

import os
from flask import Flask
from flask_sqlalchemy import SQLAlchemy as _BaseSQLAlchemy


class SQLAlchemy(_BaseSQLAlchemy):
    def apply_pool_defaults(self, app, options):
        super(SQLAlchemy, self).apply_pool_defaults(app, options)
        options["pool_pre_ping"] = True
#        options["pool_recycle"] = 30
#        options["pool_timeout"] = 35

db = SQLAlchemy()


class DevConfig():
    SQLALCHEMY_ENGINE_OPTIONS = {'pool_recycle': 280, 'pool_timeout': 100, 'pool_pre_ping': True}  # These configs don't get applied to the engine :/
    DEBUG = True
    # SERVER_NAME = '127.0.0.1:5000'
    SQLALCHEMY_DATABASE_URI = os.getenv('SQLALCHEMY_DATABASE_URI_DEV')
    SQLALCHEMY_TRACK_MODIFICATIONS = False

config = dict(
    dev=DevConfig,
)

app = Flask(__name__, instance_relative_config=True)
app.config.from_object(config['dev'])

# INIT DATABASE
db.init_app(app)
with app.app_context():
    db.create_all()

-----------run.py
app.run(host='127.0.0.1', port=5000)
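
To check whether these options actually reach the connection pool (the inline comment above notes that SQLALCHEMY_ENGINE_OPTIONS doesn't seem to get applied), the pool can be inspected with a sketch like the one below; _pre_ping and _recycle are private SQLAlchemy attributes, so this is debugging-only and may change between versions:

# Debugging sketch: inspect what the pool actually received
with app.app_context():
    pool = db.get_engine(app).pool
    print('pre_ping:', getattr(pool, '_pre_ping', 'unknown'))  # private attribute
    print('recycle:', getattr(pool, '_recycle', 'unknown'))    # private attribute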

With this, the web app now manages to get a new connection even after the MySQL server has closed the previous one. It always works fine when I access the database right after the server closes the connection (I tried up to 50 seconds after)... but when I keep the connection inactive for a long time (I haven't timed it exactly, roughly more than 10-15 minutes), I see the same error again.
According to the docs (especially the section Dealing with disconnects), the pool_pre_ping option should handle this kind of scenario in the background, right? Or is there another timeout variable that I need to change on the MySQL server?


Solution 1:

From the Flask-SQLAlchemy Configuration docs:

Certain database backends may impose different inactive connection timeouts, which interferes with Flask-SQLAlchemy’s connection pooling.

By default, MariaDB is configured to have a 600 second timeout. This often surfaces hard to debug, production environment only exceptions like

2013: Lost connection to MySQL server during query.

If you are using a backend (or a pre-configured database-as-a-service) with a lower connection timeout, it is recommended that you set SQLALCHEMY_POOL_RECYCLE to a value less than your backend’s timeout.

The script cited in the question shows discrepancies between its MySQL timeout settings (wait_timeout, net_read_timeout), its SQLAlchemy pool settings (pool_recycle, pool_timeout), and its Flask-SQLAlchemy settings (SQLALCHEMY_POOL_RECYCLE, SQLALCHEMY_POOL_TIMEOUT). With wait_timeout set to 40 seconds on the server, connections must be recycled in under 40 seconds, not the 280 seconds configured in the question.
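
For context, these Flask-SQLAlchemy settings are ultimately forwarded to SQLAlchemy's create_engine(), so the coordinated values used in the implementation below correspond roughly to this plain-SQLAlchemy sketch:

import os
from sqlalchemy import create_engine

# Rough plain-SQLAlchemy equivalent of the pool settings used below
engine = create_engine(
    os.getenv('SQLALCHEMY_DATABASE_URI_DEV'),
    pool_pre_ping=True,   # test each connection with a lightweight ping at checkout
    pool_recycle=35,      # recycle connections before the server's wait_timeout (40 s)
    pool_timeout=7,       # seconds to wait for a free connection from the pool
)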

We can resolve this by using the DevConfig helper class to coordinate the database connection settings across the app. To do that, we assign the values to class attributes and refer back to them, so there are no conflicting timeout expectations. Here is an implementation:

import os
from flask import Flask
from flask_sqlalchemy import SQLAlchemy as _BaseSQLAlchemy

# Coordinate DevConfig with SQLAlchemy and Flask-SQLAlchemy (don't repeat yourself!)

class DevConfig():
    SQLALCHEMY_POOL_RECYCLE = 35  # recycle connections before the server's wait_timeout (40 s)
    SQLALCHEMY_POOL_TIMEOUT = 7  # seconds to wait for a free connection from the pool
    SQLALCHEMY_PRE_PING = True
    SQLALCHEMY_ENGINE_OPTIONS = {'pool_recycle': SQLALCHEMY_POOL_RECYCLE, 'pool_timeout': SQLALCHEMY_POOL_TIMEOUT, 'pool_pre_ping': SQLALCHEMY_PRE_PING}
    DEBUG = True
    # SERVER_NAME = '127.0.0.1:5000'
    SQLALCHEMY_DATABASE_URI = os.getenv('SQLALCHEMY_DATABASE_URI_DEV')
    SQLALCHEMY_TRACK_MODIFICATIONS = False

class SQLAlchemy(_BaseSQLAlchemy):
    def apply_pool_defaults(self, app, options):
        super(SQLAlchemy, self).apply_pool_defaults(app, options)
        options["pool_pre_ping"] = DevConfig.SQLALCHEMY_PRE_PING
#        options["pool_recycle"] = 30
#        options["pool_timeout"] = 35

db = SQLAlchemy()

config = dict(
    dev=DevConfig,
)

app = Flask(__name__, instance_relative_config=True)
app.config.from_object(config['dev'])

# INIT DATABASE
db.init_app(app)
with app.app_context():
    db.create_all()
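
To confirm the fix, one option is to idle past the server's wait_timeout (40 seconds) and query again; with pre_ping and recycle coordinated as above, the pool should transparently hand back a live connection. A minimal sketch, assuming the app set up above:

import time

# Verification sketch: idle past wait_timeout and query again
with app.app_context():
    db.session.execute('SELECT 1')
    db.session.remove()              # return the connection to the pool
    time.sleep(45)                   # server drops the idle connection in the meantime
    db.session.execute('SELECT 1')   # fresh checkout: pre_ping detects and replaces it
    print('reconnected after idle window')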

If you like, you can check the diff for the changes I made: diffchecker.com/Q1e85Hhc