Why is using thread locals in Django bad?

I am using thread locals to store the current user and request objects. This way I can have easy access to the request from anywhere in the programme (e.g. dynamic forms) without having to pass them around.

To implement the thread locals storage in a middleware, I followed a tutorial on the Django site: https://web.archive.org/web/20091128195932/http://code.djangoproject.com:80/wiki/CookBookThreadlocalsAndUser

This document has since been modified to suggest avoiding this technique: https://web.archive.org/web/20110504132459/http://code.djangoproject.com/wiki/CookBookThreadlocalsAndUser

From the article:

From a design point of view, threadlocals are essentially global variables, and are subject to all the usual problems of portability and predictability that global variables usually entail.

More importantly, from a security point of view, threadlocals pose a huge risk. By providing a data store that exposes the state of other threads, you provide a way for one thread in your web server to potentially modify the state of another thread in the system. If the threadlocal data contains descriptions of users or other authentication-related data, that data could be used as the basis for an attack that grants access to an unauthorized user, or exposes private details of a user. While it is possible to build a threadlocal system that is safe from this sort of attack, it's a lot easier to be defensive and build a system that isn't subject to any such vulnerability in the first place.

I understand why global variables can be bad, but in this case I'm running my own code on my own server so I can't see what danger two global variables pose.

Can someone explain the security issue involved? I have asked many people how they would hack my application if they read this article and know I'm using thread locals, yet no one has been able to tell me. I am starting to suspect that this is an opinion held by hair-splitting purists who love to pass objects explicitly.


Solution 1:

I disagree entirely. Thread-local storage (TLS) is extremely useful. It should be used with care, just as globals should be used with care; but saying it shouldn't be used at all is just as ridiculous as saying globals should never be used.

For example, I store the currently active request in TLS. This makes it accessible from my logging class, without having to pass the request around through every single interface--including many that don't care about Django at all. It lets me make log entries from anywhere in the code; the logger outputs to a database table, and if a request happens to be active when a log is made, it logs things like the active user and what was being requested.
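A minimal sketch of that kind of logging setup, using only the standard library. The names `_active`, `RequestContextFilter` and `ListHandler` are illustrative assumptions rather than the code described above, and a plain object stands in for a Django request:

```python
import logging
import threading
from types import SimpleNamespace

# Hypothetical module-level thread-local store; a middleware would set .request.
_active = threading.local()


class RequestContextFilter(logging.Filter):
    """Attach the active request's user and path to every log record."""

    def filter(self, record):
        request = getattr(_active, "request", None)
        # Fall back to None when logging happens outside a request.
        record.user = getattr(request, "user", None)
        record.path = getattr(request, "path", None)
        return True  # never drop the record, only annotate it


class ListHandler(logging.Handler):
    """Collects records in memory; stands in for a database-backed handler."""

    def __init__(self):
        super().__init__()
        self.records = []

    def emit(self, record):
        self.records.append(record)


logger = logging.getLogger("tls_demo")
logger.setLevel(logging.INFO)
logger.propagate = False
handler = ListHandler()
handler.addFilter(RequestContextFilter())
logger.addHandler(handler)

# Simulate a request being active on this thread.
_active.request = SimpleNamespace(user="alice", path="/reports/")
logger.info("generated report")
del _active.request

# No request active: the same call still works, with empty context.
logger.info("background job ran")
```

The code that calls `logger.info` never sees the request; only the filter knows about the thread-local store.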

If you don't want one thread to have the capability of modifying another thread's TLS data, then set your TLS up to prohibit this, which probably requires using a native TLS class. I don't find that argument convincing, though; if an attacker can execute arbitrary Python code as your backend, your system is already fatally compromised--he could monkey patch anything to be run later as a different user, for example.
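For reference, Python's standard `threading.local` gives each thread its own attribute namespace, so ordinary attribute access on it does not see or overwrite another thread's values. A small sketch (the variable names are illustrative):

```python
import threading

store = threading.local()
store.user = "main-thread-user"

seen_in_worker = {}


def worker():
    # The worker starts with an empty namespace: the main thread's value
    # is not visible here.
    seen_in_worker["before"] = getattr(store, "user", None)
    store.user = "worker-user"
    seen_in_worker["after"] = store.user


t = threading.Thread(target=worker)
t.start()
t.join()

# The worker's assignment did not touch the main thread's value.
main_value = store.user
```

This says nothing about an attacker who can already run arbitrary code in the process; it only shows the isolation you get in normal use.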

Obviously, you'll want to clear any TLS at the end of a request; in Django, that means clearing it in process_response and process_exception in a middleware class.

Solution 2:

Apart from the risk of mixing up data from different users, thread locals should be avoided because they hide a dependency. If you pass arguments to a method, you can see exactly what you're passing. A thread local, by contrast, is a hidden channel in the background, and you may be left wondering why a method doesn't work correctly in some cases.

There are some cases where thread locals are a good choice, but they should be used rarely and carefully!
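A small sketch of the hidden-dependency problem. The function names and the `_context` store are made up for illustration:

```python
import threading

_context = threading.local()  # hypothetical thread-local store


def greeting_implicit():
    # Hidden dependency: nothing in the signature says a user must be set.
    user = getattr(_context, "user", None)
    return "Hello, %s" % user if user else "Hello, stranger"


def greeting_explicit(user):
    # The dependency is visible to every caller and easy to test.
    return "Hello, %s" % user


# Inside a "request": some other code has populated the thread local.
_context.user = "alice"
in_request = greeting_implicit()

# Outside a request (a cron job, a test, a shell): same call, different result.
del _context.user
outside_request = greeting_implicit()

explicit = greeting_explicit("alice")
```

The implicit version gives a different result depending on state the caller cannot see, while the explicit version cannot be called without supplying the user.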

Solution 3:

A quick example of how to create a TLS middleware compatible with the new middleware style introduced in Django 1.10:

# coding: utf-8
# Copyright (c) Alexandre Syenchuk (alexpirine), 2016

# threading.local is in the standard library of every Python version that
# Django 1.10 supports, so no fallback import is needed.
from threading import local

_thread_locals = local()

def get_current_request():
    return getattr(_thread_locals, 'request', None)

def get_current_user():
    request = get_current_request()
    if request:
        return getattr(request, 'user', None)

class ThreadLocalMiddleware(object):
    def __init__(self, get_response):
        self.get_response = get_response

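    def __call__(self, request):
        _thread_locals.request = request
        try:
            return self.get_response(request)
        finally:
            # Drop the reference so the request cannot leak into the next
            # request served by this thread.
            del _thread_locals.request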

Solution 4:

This question is really old, but I saw someone referring to it, so I want to note that the wiki page cited by this question stopped recommending threadlocal storage in 2010 and then was deleted altogether by 2012.