celery not operating on the same object in flask server [duplicate]

In my application, the state of a common object is changed by making requests, and the response depends on the state.

class SomeObj():
    def __init__(self, param):
        self.param = param
    def query(self):
        self.param += 1
        return self.param

global_obj = SomeObj(0)

@app.route('/')
def home():
    flash(global_obj.query())
    render_template('index.html')

If I run this on my development server, I expect to get 1, 2, 3 and so on. If requests are made from 100 different clients simultaneously, can something go wrong? The expected result would be that the 100 different clients each see a unique number from 1 to 100. Or will something like this happen:

  1. Client 1 queries. self.param is incremented by 1.
  2. Before the return statement can be executed, the thread switches over to client 2. self.param is incremented again.
  3. The thread switches back to client 1, and the client is returned the number 2, say.
  4. Now the thread moves to client 2 and returns him/her the number 3.

Since there were only two clients, the expected results were 1 and 2, not 2 and 3. A number was skipped.

Will this actually happen as I scale up my application? What alternatives to a global variable should I look at?


You can't use global variables to hold this sort of data. Not only is it not thread safe, it's not process safe, and WSGI servers in production spawn multiple processes. Not only would your counts be wrong if you were using threads to handle requests, they would also vary depending on which process handled the request.

Use a data source outside of Flask to hold global data. A database, memcached, or redis are all appropriate separate storage areas, depending on your needs. If you need to load and access Python data, consider multiprocessing.Manager. You could also use the session for simple data that is per-user.


The development server may run in single thread and process. You won't see the behavior you describe since each request will be handled synchronously. Enable threads or processes and you will see it. app.run(threaded=True) or app.run(processes=10). (In 1.0 the server is threaded by default.)


Some WSGI servers may support gevent or another async worker. Global variables are still not thread safe because there's still no protection against most race conditions. You can still have a scenario where one worker gets a value, yields, another modifies it, yields, then the first worker also modifies it.


If you need to store some global data during a request, you may use Flask's g object. Another common case is some top-level object that manages database connections. The distinction for this type of "global" is that it's unique to each request, not used between requests, and there's something managing the set up and teardown of the resource.


This is not really an answer to thread safety of globals.

But I think it is important to mention sessions here. You are looking for a way to store client-specific data. Every connection should have access to its own pool of data, in a threadsafe way.

This is possible with server-side sessions, and they are available in a very neat flask plugin: https://pythonhosted.org/Flask-Session/

If you set up sessions, a session variable is available in all your routes and it behaves like a dictionary. The data stored in this dictionary is individual for each connecting client.

Here is a short demo:

from flask import Flask, session
from flask_session import Session

app = Flask(__name__)
# Check Configuration section for more details
SESSION_TYPE = 'filesystem'
app.config.from_object(__name__)
Session(app)

@app.route('/')
def reset():
    session["counter"]=0

    return "counter was reset"

@app.route('/inc')
def routeA():
    if not "counter" in session:
        session["counter"]=0

    session["counter"]+=1

    return "counter is {}".format(session["counter"])

@app.route('/dec')
def routeB():
    if not "counter" in session:
        session["counter"] = 0

    session["counter"] -= 1

    return "counter is {}".format(session["counter"])


if __name__ == '__main__':
    app.run()

After pip install Flask-Session, you should be able to run this. Try accessing it from different browsers, you'll see that the counter is not shared between them.


Another example of a data source external to requests is a cache, such as what's provided by Flask-Caching or another extension.

  1. Create a file common.py and place in it the following:
from flask_caching import Cache

# Instantiate the cache
cache = Cache()
  1. In the file where your flask app is created, register your cache with the following code:
# Import cache
from common import cache

# ...
app = Flask(__name__)

cache.init_app(app=app, config={"CACHE_TYPE": "filesystem",'CACHE_DIR': Path('/tmp')})
  1. Now use throughout your application by importing the cache and executing as follows:
# Import cache
from common import cache

# store a value
cache.set("my_value", 1_000_000)

# Get a value
my_value = cache.get("my_value")

While totally accepting the previous upvoted answers, and discouraging use of global variables for production and scalable Flask storage, for the purpose of prototyping or really simple servers, running under the flask 'development server'...

...

The Python built-in data types, and I personally used and tested the global dict, as per Python documentation are thread safe. Not process safe.

The insertions, lookups, and reads from such a (server global) dict will be OK from each (possibly concurrent) Flask session running under the development server.

When such a global dict is keyed with a unique Flask session key, it can be rather useful for server-side storage of session specific data otherwise not fitting into the cookie (max size 4 kB).

Of course, such a server global dict should be carefully guarded for growing too large, being in-memory. Some sort of expiring the 'old' key/value pairs can be coded during request processing.

Again, it is not recommended for production or scalable deployments, but it is possibly OK for local task-oriented servers where a separate database is too much for the given task.

...