Importing modules: __main__ vs import as module
The __name__ variable always contains the name of the module, except when the file has been loaded into the interpreter as a script. In that case the variable is set to the string '__main__' instead.
After all, the script is then run as the main file of the whole program, and everything else is a module imported directly or indirectly by that main file. By testing the __name__ variable, you can thus detect whether a file has been imported as a module or was run directly.
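A minimal sketch of that check, assuming a throwaway file called demo_mod.py (a hypothetical name, not from the question). The same one-line source is executed once as a script via runpy and once as an ordinary module via importlib, and records the __name__ it sees each time:

```python
import importlib.util
import os
import runpy
import tempfile

# Hypothetical one-line module that just records the name it was given.
SOURCE = "seen_name = __name__\n"


def names_seen():
    """Return (__name__ when run as a script, __name__ when imported)."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo_mod.py")
        with open(path, "w") as handle:
            handle.write(SOURCE)

        # Execute the file under the name a directly run script receives.
        as_script = runpy.run_path(path, run_name="__main__")["seen_name"]

        # Load the very same file as an ordinary module named 'demo_mod'.
        spec = importlib.util.spec_from_file_location("demo_mod", path)
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)
        return as_script, module.seen_name


if __name__ == "__main__":
    print(names_seen())
```

The file on disk is identical in both cases; only the way it is loaded decides which name it gets.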
Internally, each module is given a namespace dictionary, which is stored as part of the module object held in sys.modules. The main file, the executed script, is stored in that same structure as '__main__'.
But when you import a file as a module, Python first looks in sys.modules to see if that module has already been imported. So import mod1 means that Python first looks in sys.modules for the mod1 module, and creates a new module structure with its own namespace only if mod1 isn't there yet.
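The cache lookup can be observed directly. This is a small sketch, using the standard json module purely as a convenient example; import_twice is a hypothetical helper name:

```python
import importlib
import sys


def import_twice(name):
    """Import `name` twice and report whether the cached module was reused."""
    first = importlib.import_module(name)
    second = importlib.import_module(name)
    # Both imports hand back the single object stored in sys.modules.
    return first is second, sys.modules[name] is first


if __name__ == "__main__":
    print(import_twice("json"))
```

The second import does not execute the module body again; it just returns the entry that is already in sys.modules.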
So, if you run mod1.py as the main file and later import it as a Python module, it gets two entries in sys.modules: one as '__main__', then later one as 'mod1'. These two namespaces are completely separate. Your global var1 is stored in sys.modules['__main__'], but func1B is looking in sys.modules['mod1'] for var1, where it is None.
But when you use python driver.py, driver.py becomes the '__main__' file of the program, and mod1 is imported just once, into the sys.modules['mod1'] structure. This time round, func1A stores var1 in the sys.modules['mod1'] structure, and that's what func1B will find.
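Both situations can be reproduced with a rough sketch like the following. The names var1, func1A, func1B, mod1.py and driver.py come from the discussion above, but the file bodies are an assumed reconstruction, not your actual code. The files are written to a temporary directory and each is run in a fresh interpreter:

```python
import os
import subprocess
import sys
import tempfile

# Assumed body for mod1.py; only the names come from the discussion above.
MOD1 = '''
var1 = None

def func1A():
    global var1
    var1 = "set"            # writes into the namespace func1A was defined in

def func1B():
    import mod1             # always resolves through sys.modules['mod1']
    return mod1.var1

if __name__ == "__main__":
    func1A()
    print(func1B())
'''

# Assumed body for driver.py.
DRIVER = '''
import mod1
mod1.func1A()
print(mod1.func1B())
'''


def run_both():
    """Run mod1.py directly, then via driver.py; return what each printed."""
    with tempfile.TemporaryDirectory() as tmp:
        for filename, source in (("mod1.py", MOD1), ("driver.py", DRIVER)):
            with open(os.path.join(tmp, filename), "w") as handle:
                handle.write(source)

        def run(filename):
            result = subprocess.run(
                [sys.executable, os.path.join(tmp, filename)],
                capture_output=True, text=True, check=True,
            )
            return result.stdout.strip()

        return run("mod1.py"), run("driver.py")


if __name__ == "__main__":
    print(run_both())
```

Run directly, mod1.py prints None because func1A wrote to the '__main__' namespace while func1B read the separate 'mod1' one. Run through driver.py, both functions share the single 'mod1' namespace and the value is found.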
Regarding a practical solution for using a module optionally as the main script while still supporting consistent cross-imports:
Solution 1:
See e.g. how Python's pdb module is run as a script: it imports itself when executing as __main__ (at the end of the file):
#! /usr/bin/env python
"""A Python debugger."""
# (See pdb.doc for documentation.)
import sys
import linecache
...
# When invoked as main program, invoke the debugger on a script
if __name__ == '__main__':
    import pdb
    pdb.main()
I would just recommend moving the __main__ startup to the beginning of the script, like this:
#! /usr/bin/env python
"""A Python debugger."""
# When invoked as main program, invoke the debugger on a script
import sys
if __name__ == '__main__':
    ##assert os.path.splitext(os.path.basename(__file__))[0] == 'pdb'
    import pdb
    pdb.main()
    sys.exit(0)
import linecache
...
This way the module body is not executed twice, which is "costly", undesirable and sometimes critical.
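The difference between the two layouts can be made visible with a small experiment. This is only a sketch: tool.py and its empty main() are hypothetical stand-ins for pdb, and the print call marks where the module body runs:

```python
import os
import subprocess
import sys
import tempfile

# Guard at the end: the whole body runs as '__main__', then again as 'tool'.
LATE_GUARD = '''
print("body executed as", __name__)

def main():
    pass

if __name__ == "__main__":
    import tool
    tool.main()
'''

# Guard at the top: '__main__' hands over to the imported copy and exits.
EARLY_GUARD = '''
import sys
if __name__ == "__main__":
    import tool
    tool.main()
    sys.exit(0)

print("body executed as", __name__)

def main():
    pass
'''


def body_runs(source):
    """Run `source` as tool.py and return the lines its body printed."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "tool.py")
        with open(path, "w") as handle:
            handle.write(source)
        result = subprocess.run(
            [sys.executable, path], capture_output=True, text=True, check=True
        )
        return result.stdout.splitlines()


if __name__ == "__main__":
    print(body_runs(LATE_GUARD))
    print(body_runs(EARLY_GUARD))
```

With the guard at the end the body is reported twice, once per namespace. With the guard at the top it is reported once, as 'tool' only.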
Solution 2:
In rarer cases it is desirable to expose the actual script module __main__ directly under its regular module name (mod1) as well:
# mod1.py
import sys
import mod2
...
if __name__ == '__main__':
    # use main script directly as cross-importable module
    _mod = sys.modules['mod1'] = sys.modules[__name__]
    ##_modname = os.path.splitext(os.path.basename(os.path.realpath(__file__)))[0]
    ##_mod = sys.modules[_modname] = sys.modules[__name__]
    func1A()
Known drawbacks:
- reload(_mod) fails
- pickle'ed classes would need extra mappings for unpickling (find_global ..)
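The aliasing from Solution 2 can be checked with a sketch along these lines. The body of mod1.py is again an assumed reconstruction, and the Thing class is a hypothetical addition used only to show which module name classes end up carrying:

```python
import os
import subprocess
import sys
import tempfile

# Assumed mod1.py that registers itself under its regular module name.
MOD1 = '''
import sys

var1 = None

class Thing:                # hypothetical class, only here to inspect __module__
    pass

def func1A():
    global var1
    var1 = "set"

def func1B():
    import mod1             # now finds the aliased '__main__' module
    return mod1.var1

if __name__ == "__main__":
    sys.modules["mod1"] = sys.modules[__name__]
    func1A()
    import mod1
    print(func1B(), mod1 is sys.modules["__main__"], Thing.__module__)
'''


def run_aliased():
    """Run the aliased mod1.py as a script and return the printed fields."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "mod1.py")
        with open(path, "w") as handle:
            handle.write(MOD1)
        result = subprocess.run(
            [sys.executable, path], capture_output=True, text=True, check=True
        )
        return result.stdout.split()


if __name__ == "__main__":
    print(run_aliased())
```

func1B now sees the value set by func1A because both names refer to one module object. The class still reports '__main__' as its module, which is presumably where the pickling drawback comes from: a pickle written by the script would refer to __main__.Thing rather than mod1.Thing.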