Schrödinger's variable: the __class__ cell magically appears if you're checking for its presence?

There's a surprise here:

>>> class B:
...     print(locals())
...     def foo(self):
...         print(locals())
...         print(__class__ in locals().values())
...         
{'__module__': '__main__', '__qualname__': 'B'}
>>> B().foo()
{'__class__': <class '__main__.B'>, 'self': <__main__.B object at 0x7fffe916b4a8>}
True

It seems like the mere mention of __class__ is explicitly checked by the parser? Otherwise we should get something like

NameError: name '__class__' is not defined

Indeed, if you modify to only check the key instead, i.e. check for '__class__' in locals(), then we only have self in scope as expected.

How does it happen that this variable gets magically injected into scope? My guess is this is something to do with super - but I didn't use super, so why does the compiler create an implicit closure reference here if it isn't needed?


This is a weird interaction in Python 3's implementation of no-argument super. An access to super in a method triggers the addition of a hidden __class__ closure variable referring to the class that defines the method. The parser special-cases a load of the name super in a method by also adding __class__ to the method's symbol table, and then the rest of the relevant code all looks for __class__ instead of super. However, if you try to access __class__ yourself, all the code looking for __class__ sees it and thinks it should do the super handling!

Here's where it adds the name __class__ to the symbol table if it sees super:

case Name_kind:
    if (!symtable_add_def(st, e->v.Name.id,
                          e->v.Name.ctx == Load ? USE : DEF_LOCAL))
        VISIT_QUIT(st, 0);
    /* Special-case super: it counts as a use of __class__ */
    if (e->v.Name.ctx == Load &&
        st->st_cur->ste_type == FunctionBlock &&
        !PyUnicode_CompareWithASCIIString(e->v.Name.id, "super")) {
        if (!GET_IDENTIFIER(__class__) ||
            !symtable_add_def(st, __class__, USE))
            VISIT_QUIT(st, 0);
    }
    break;

Here's drop_class_free, which sets ste_needs_class_closure:

static int
drop_class_free(PySTEntryObject *ste, PyObject *free)
{
    int res;
    if (!GET_IDENTIFIER(__class__))
        return 0;
    res = PySet_Discard(free, __class__);
    if (res < 0)
        return 0;
    if (res)
        ste->ste_needs_class_closure = 1;
    return 1;
}

The compiler section that checks ste_needs_class_closure and creates the implicit cell:

if (u->u_ste->ste_needs_class_closure) {
    /* Cook up an implicit __class__ cell. */
    _Py_IDENTIFIER(__class__);
    PyObject *tuple, *name, *zero;
    int res;
    assert(u->u_scope_type == COMPILER_SCOPE_CLASS);
    assert(PyDict_Size(u->u_cellvars) == 0);
    name = _PyUnicode_FromId(&PyId___class__);
    if (!name) {
        compiler_unit_free(u);
        return 0;
    }
    ...

There's more relevant code, but it's too much to include all of it. Python/compile.c and Python/symtable.c are where to look if you want to see more.

You can get some weird bugs if you try to use a variable named __class__:

class Foo:
    def f(self):
        __class__ = 3
        super()

Foo().f()

Output:

Traceback (most recent call last):
  File "./prog.py", line 6, in <module>
  File "./prog.py", line 4, in f
RuntimeError: super(): __class__ cell not found

The assignment to __class__ means __class__ is a local variable instead of a closure variable, so the closure cell super() needs isn't there.

def f():
    __class__ = 2
    class Foo:
        def f(self):
            print(__class__)

    Foo().f()

f()

Output:

<class '__main__.f.<locals>.Foo'>

Even though there's an actual __class__ variable in the enclosing scope, the special-casing of __class__ means you get the class instead of the enclosing scope's variable value.


https://docs.python.org/3/reference/datamodel.html#creating-the-class-object

__class__ is an implicit closure reference created by the compiler if any methods in a class body refer to either __class__ or super. This allows the zero argument form of super() to correctly identify the class being defined based on lexical scoping, while the class or instance that was used to make the current call is identified based on the first argument passed to the method.