Where is Python's shutdown procedure setting module globals to None documented?
CPython has a strange behaviour where it sets module globals to None during shutdown. This breaks error logging during shutdown of some multithreaded code I've written.
I can't find any documentation of this behaviour. It's mentioned in passing in PEP 432:
[...] significantly reducing the number of modules that will experience the "module globals set to None" behaviour that is used to deliberate break cycles and attempt to releases more external resources cleanly.
There are SO questions about this behaviour and the C API documentation mentions shutdown behaviour for embedded interpreters.
I've also found a related thread on python-dev and a related CPython bug:
This patch does not change the behavior of module objects clearing their globals dictionary as soon as they are deallocated.
Where is this behaviour documented? Is it Python 2 specific?
Solution 1:
The behaviour is not well documented. It is present in all versions of Python from about 1.5 up to Python 3.4, where it was mostly removed. From the What's New in Python 3.4 notes on PEP 442:
As part of this change, module globals are no longer forcibly set to None during interpreter shutdown in most cases, instead relying on the normal operation of the cyclic garbage collector.
The only documentation for the behaviour is the moduleobject.c source code:
/* To make the execution order of destructors for global
objects a bit more predictable, we first zap all objects
whose name starts with a single underscore, before we clear
the entire dictionary. We zap them by replacing them with
None, rather than deleting them from the dictionary, to
avoid rehashing the dictionary (to some extent). */
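The two passes described in that comment can be imitated in pure Python. This is only a rough sketch, not the CPython implementation: the function name simulate_module_clear and the sample namespace are made up for illustration, and the exemption of `__builtins__` in the second pass is my reading of the C source rather than something the quoted comment states.

```python
def simulate_module_clear(namespace):
    """Imitate the two-pass zapping on a plain dict; return the zap order."""
    order = []
    # Pass 1: names that start with a single underscore (e.g. "_cache").
    for name in list(namespace):
        single_underscore = name.startswith("_") and not name.startswith("__")
        if single_underscore and namespace[name] is not None:
            namespace[name] = None  # replace the value, keep the key
            order.append(name)
    # Pass 2: everything that is left, except __builtins__ (assumption, see above).
    for name in list(namespace):
        if name != "__builtins__" and namespace[name] is not None:
            namespace[name] = None
            order.append(name)
    return order


# Hypothetical module namespace, with placeholder objects as values.
namespace = {
    "__builtins__": object(),
    "__name__": "demo",
    "logger": object(),
    "_cache": object(),
    "os": object(),
}
order = simulate_module_clear(namespace)
print(order)  # ['_cache', '__name__', 'logger', 'os']
```

Every key survives the process; only the values are replaced, which is what produces the confusing "None" errors at shutdown.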
Note that setting the values to None is an optimisation; the alternative would be to delete names from the mapping, which would lead to different errors (NameError exceptions rather than AttributeError exceptions when trying to use globals from a __del__ handler).
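The difference between the two failure modes can be reproduced without shutting down an interpreter. The sketch below builds a throwaway module (the name demo_shutdown and the cleanup function are hypothetical) and then either replaces a global with None or deletes it by hand, which only imitates what the shutdown code did.

```python
import types

# Source of a hypothetical module used only for this demonstration.
SOURCE = '''
import os

def cleanup():
    # Looks up the global name "os" each time it is called.
    return os.getcwd()
'''


def make_module():
    module = types.ModuleType("demo_shutdown")
    exec(SOURCE, module.__dict__)
    return module


def error_name(func):
    """Call func and return the name of the exception it raises, if any."""
    try:
        func()
    except Exception as exc:
        return type(exc).__name__
    return None


# What the old shutdown code did: keep the key, replace the value with None.
zapped = make_module()
zapped.__dict__["os"] = None
print(error_name(zapped.cleanup))   # AttributeError

# The alternative: remove the key from the module dictionary entirely.
deleted = make_module()
del deleted.__dict__["os"]
print(error_name(deleted.cleanup))  # NameError
```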
As you found out on the mailing list, the behaviour predates the cyclic garbage collector; it was added in 1998, while the cyclic garbage collector was added in 2000. Since function objects always reference the module __dict__, all function objects in a module involve circular references, which is why the __dict__ needed clearing before GC came into play.
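The circular reference is easy to see from Python itself. A minimal sketch, using a throwaway module (demo_cycle and helper are hypothetical names):

```python
import types

module = types.ModuleType("demo_cycle")
exec("def helper():\n    return 42\n", module.__dict__)

func = module.helper
# A function's global namespace is the very dict of the module that defines it...
print(func.__globals__ is module.__dict__)   # True
# ...and that dict in turn holds the function, which closes the loop.
print(module.__dict__["helper"] is func)     # True
```

With pure reference counting, neither the function nor the dictionary can reach a count of zero until one side of that loop is broken, and emptying the dictionary is one way to break it.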
It was kept in place even when cyclic GC was added, because there might be objects with __del__ methods involved in cycles. These aren't otherwise garbage-collectable, and cleaning out the module dictionary would at least remove the module __dict__ from such cycles. Not doing that would keep all referenced globals of that module alive.
The changes made for PEP 442 now make it possible for the garbage collector to clear cyclic references involving objects that provide a __del__ finalizer, removing the need to clear the module __dict__ in most cases. The clearing code is still there, but it is only triggered if the module __dict__ is still alive after the contents of sys.modules have been moved to weak references and a GC collection run has been started during interpreter shutdown; the module finalizer itself simply decrements the reference count of the __dict__.
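The effect of PEP 442 can be checked directly on CPython 3.4 or later. In this sketch (the Resource class and the finalized list are made up for the demonstration), two objects with __del__ methods reference each other, and the collector still finalizes and frees both:

```python
import gc

finalized = []


class Resource:
    def __init__(self, name):
        self.name = name
        self.partner = None

    def __del__(self):
        finalized.append(self.name)


def build_cycle():
    a = Resource("a")
    b = Resource("b")
    a.partner = b
    b.partner = a  # a <-> b: unreachable once this function returns


build_cycle()
gc.collect()
print(sorted(finalized))  # ['a', 'b'] on CPython 3.4+
```

Before 3.4, such a cycle was not freed and the objects ended up in gc.garbage, which is the situation the forced clearing of module globals was meant to work around.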