What's the purpose of the "__package__" attribute in Python?
See the PEP 366 and import system reference documentation:
The major proposed change is the introduction of a new module level attribute, __package__. When it is present, relative imports will be based on this attribute rather than the module __name__ attribute.
and
- The module’s __package__ attribute should be set. Its value must be a string, but it can be the same value as its __name__. If the attribute is set to None or is missing, the import system will fill it in with a more appropriate value. When the module is a package, its __package__ value should be set to its __name__. When the module is not a package, __package__ should be set to the empty string for top-level modules, or for submodules, to the parent package’s name. See PEP 366 for further details.
So, for a module located in foo/bar/baz.py, __name__ is set to foo.bar.baz, and __package__ is set to foo.bar, while foo/bar/__init__.py will have foo.bar for both the __name__ and __package__ attributes.
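As a quick check of those rules, here is a minimal sketch that inspects three standard-library modules. The describe helper is a hypothetical name introduced only for this illustration, and the sketch assumes an interpreter that sets __package__ on imported modules.

```python
import json
import json.decoder
import os


def describe(module):
    """Return the (__name__, __package__) pair of an imported module."""
    return module.__name__, module.__package__


# Submodule: __package__ is the parent package's name.
print(describe(json.decoder))
# Package: __package__ is the same as __name__.
print(describe(json))
# Top-level module that is not a package: __package__ is the empty string.
print(describe(os))
```

The json package plays the role of foo/bar/__init__.py above, and json.decoder plays the role of foo/bar/baz.py.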
All I want to know is what exactly does __package__ mean?
It is the mechanism that enables explicit relative imports.
There are three possible categories of values for __package__:
- A package name (a string)
- An empty string
- None
Package Name
If a module is in a package, __package__ is set to the package name to enable explicit relative imports. Specifically:
When the module is a package, its __package__ value should be set to its __name__. When the module is not a package, __package__ should be set [...] for submodules, to the parent package’s name.
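To see why the package name is what makes a relative import resolvable, the following sketch uses importlib.util.resolve_name, which turns a relative name plus a package name into an absolute module name. It is meant as an illustration of the same kind of resolution, not as the import statement's exact code path. The names package, bar, and foo.bar are placeholders.

```python
from importlib.util import resolve_name

# One leading dot: resolve relative to the package itself.
print(resolve_name('.bar', 'package'))

# Two leading dots: go up one level from the package first.
print(resolve_name('..baz', 'foo.bar'))

# With an empty package name there is nothing to resolve against.
# Older Python versions raise ValueError here, newer ones ImportError.
try:
    resolve_name('.bar', '')
except (ImportError, ValueError) as exc:
    print('cannot resolve:', type(exc).__name__)
```

The last case mirrors what happens when a module with an empty or missing __package__ attempts an explicit relative import.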
Empty String
If a module is at the root, or top level, that is, the current module is imported with
import current_module
or when a top-level module is run as the entry point as with:
$ python -m current_module
then __package__ is an empty string. Or, as the documentation says:
When the module is not a package, __package__ should be set to the empty string for top-level modules...
None
If a module/script is run by filename, __package__ is None:
When the main module is specified by its filename, then the __package__ attribute will be set to None.
Evidence
First, let's create a file structure with noisy debugging, using Python 3.6:
from pathlib import Path
text = "print(f'{__name__}, __file__: {__file__}, __package__: {repr(__package__)}')"
Path('foo.py').write_text(text)
Path('package').mkdir()
Path('package/__init__.py').write_text(text)
Path('package/__main__.py').write_text(text)
Path('package/bar.py').write_text(text)
# and include a submodule with a relative import:
Path('package/baz.py').write_text(text + '\nfrom . import bar')
Now we see that foo.py executed as a module has an empty string for __package__, while the script executed by file name as the entry point has None:
$ python -m foo
__main__, __file__: ~\foo.py, __package__: ''
$ python foo.py
__main__, __file__: foo.py, __package__: None
When we execute a package as a module for the entry point, its __init__.py module runs, then its __main__.py runs:
$ python -m package
package, __file__: ~\package\__init__.py, __package__: 'package'
__main__, __file__: ~\package\__main__.py, __package__: 'package'
Similarly, when we execute a submodule as a module for the entry point, the package's __init__.py module runs first, then the submodule itself runs:
$ python -m package.bar
package, __file__: ~\package\__init__.py, __package__: 'package'
__main__, __file__: ~\package\bar.py, __package__: 'package'
Finally, we see that the explicit relative import (which happens last here), the entire reason for having __package__, is enabled:
$ python -m package.baz
package, __file__: ~\package\__init__.py, __package__: 'package'
__main__, __file__: ~\package\baz.py, __package__: 'package'
package.bar, __file__: ~\package\bar.py, __package__: 'package'
Note, in the output, I have substituted ~ for the parent directories.
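For anyone who wants to reproduce part of this without typing the commands by hand, here is a rough sketch that builds a reduced version of the layout in a temporary directory and runs the interpreter as a subprocess. The helper names run and collect_results are made up for this example, __file__ is left out of the debug line so the output does not depend on paths, and the sketch assumes Python 3.7 or later with the default behavior of putting the working directory on sys.path for -m.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

# Same debugging idea as the setup code, without __file__.
TEXT = "print(f'{__name__}: {__package__!r}')"


def run(args, cwd):
    """Run the current interpreter with args inside cwd; return stdout lines."""
    result = subprocess.run(
        [sys.executable, *args],
        cwd=cwd,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.splitlines()


def collect_results():
    """Create foo.py and package/ in a temp dir and run three entry points."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / 'foo.py').write_text(TEXT)
        (root / 'package').mkdir()
        (root / 'package' / '__init__.py').write_text(TEXT)
        (root / 'package' / 'bar.py').write_text(TEXT)
        return {
            'python foo.py': run(['foo.py'], root),
            'python -m foo': run(['-m', 'foo'], root),
            'python -m package.bar': run(['-m', 'package.bar'], root),
        }


results = collect_results()
for command, lines in results.items():
    print(command, '->', lines)
```

Running by filename should report None, running a top-level module with -m should report an empty string, and running the submodule with -m should report 'package' for both the package and the entry point.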