Python Packaging: Data files are put properly in tar.gz file but are not installed to virtual environment
I can't properly install the project package_fiddler to my virtual environment.
I have figured out that MANIFEST.in is responsible for putting the non-.py files in Package_fiddler-0.0.0.tar.gz, which is generated when executing python setup.py sdist.
Then I did:
(virt_envir)$ pip install dist/Package_fiddler-0.0.0.tar.gz
But this installed neither the data files nor the package to /home/username/.virtualenvs/virt_envir/local/lib/python2.7/site-packages.
I have tried many configurations of the setup arguments package_data, include_package_data and data_files, but I seem to have used the wrong configuration each time.
Which configuration of package_data and/or include_package_data and/or data_files will properly install package_fiddler to my virtual environment?
Project tree
.
├── MANIFEST.in
├── package_fiddler
│   ├── data
│   │   ├── example.html
│   │   └── stylesheets
│   │       └── example.css
│   └── __init__.py
├── README.rst
└── setup.py
setup.py
from setuptools import setup

setup(
    name='Package_fiddler',
    entry_points={
        'console_scripts': ['package_fiddler = package_fiddler:main'],
    },
    long_description=open('README.rst').read(),
    packages=['package_fiddler'],
)
MANIFEST.in
include README.rst
recursive-include package_fiddler/data *
Which configurations of setup.py (with the code base above) have I tried?
Configuration 1
Adding:
package_data={"": ['package_fiddler/data/*',]}
Configuration 2
Adding:
package_data={"": ['*.html', '*.css', '*.rst']}
Configuration 3
Adding:
include_package_data=True
Configuration 4
Adding:
package_data={"": ['package_fiddler/data',]}
Removing:
packages=['package_fiddler',]
Configuration 5 (Chris's suggestion)
Adding:
package_data={"data": ['package_fiddler/data',]}
Removing:
packages=['package_fiddler',]
Configuration 6
Adding:
package_data={"": ['package_fiddler/data/*',]}
Removing:
packages=['package_fiddler',]
These configurations all result in no files at all being installed in /home/username/.virtualenvs/virt_envir/local/lib/python2.7/site-packages.
EDIT
Note to Toshio Kuratomi:
In my original post I used the simplest tree structure in which this problem occurs, for clarity, but in reality my tree looks more like the one below. For that tree, strangely, if I only put an __init__.py in stylesheets, all the data files in the texts folder are somehow also installed correctly! This baffles me.
Tree 2 (This installs all data files properly somehow!!)
.
├── MANIFEST.in
├── package_fiddler
│   ├── stylesheets
│   │   ├── __init__.py
│   │   ├── example.css
│   │   └── other
│   │       └── example2.css
│   ├── texts
│   │   ├── example.txt
│   │   └── other
│   │       └── example2.txt
│   └── __init__.py
├── README.rst
└── setup.py
Solution 1:
Found a solution that worked for me here.
Using setuptools==2.0.2 I did:
setuptools.setup(
    ...
    packages=setuptools.find_packages(),
    include_package_data=True,  # use MANIFEST.in during install
    ...
)
Solution 2:
I personally dislike the way setuptools mixes code and data, both conceptually and implementation-wise. I think it's that implementation that is tripping you up here. For setuptools to find and use package_data, the data needs to reside inside a Python package. A Python package can be a directory, but there needs to be an __init__.py file in that directory. So it looks like you need the following files (empty is fine):
./package_fiddler/data/__init__.py
./package_fiddler/data/stylesheets/__init__.py
Solution 3:
The easiest way to include package data in "setup.py" is like so:
package_data = {'<package name>': ['<path to data file within package dir>']}
So in your example:
package_data = {'package_fiddler': ['data/*', 'data/stylesheets/*']}
package_data is a dictionary where the keys are the names of the packages included in the installer. The values under these keys should be lists of specific file paths or globs/wildcards within the package directory.
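To see which files a given set of patterns would pick up, the following is a rough sketch that mimics the matching with the standard glob module. It is only an approximation of what setuptools does, under the assumption that patterns are expanded relative to the package directory, are not recursive, and that matched directories are skipped. The helper names expand_package_data and make_demo_tree are made up for illustration.

```python
import glob
import os
import tempfile


def expand_package_data(package_dir, patterns):
    """Return files matched by package_data-style patterns, relative to package_dir."""
    matched = set()
    for pattern in patterns:
        for path in glob.glob(os.path.join(package_dir, pattern)):
            if os.path.isfile(path):  # a matched directory is skipped, not descended into
                rel = os.path.relpath(path, package_dir)
                matched.add(rel.replace(os.sep, "/"))
    return sorted(matched)


def make_demo_tree(root):
    """Recreate the question's package_fiddler layout under root, with empty files."""
    package_dir = os.path.join(root, "package_fiddler")
    os.makedirs(os.path.join(package_dir, "data", "stylesheets"))
    for rel in ("__init__.py", "data/example.html", "data/stylesheets/example.css"):
        open(os.path.join(package_dir, *rel.split("/")), "w").close()
    return package_dir


with tempfile.TemporaryDirectory() as root:
    package_dir = make_demo_tree(root)
    # One level only: the stylesheets directory matches but is not a file.
    only_top = expand_package_data(package_dir, ["data/*"])
    # Each subdirectory needs its own pattern.
    both = expand_package_data(package_dir, ["data/*", "data/stylesheets/*"])
    # Paths that repeat the package name point at package_fiddler/package_fiddler/...
    wrong_root = expand_package_data(package_dir, ["package_fiddler/data/*"])

print(only_top)
print(both)
print(wrong_root)
```

Under these assumptions, the last case suggests why patterns such as 'package_fiddler/data/*' from the question matched nothing: they are resolved inside the package directory, not from the project root.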
You also need to include the flag zip_safe=False in setup(...) if you want to be able to resolve file system paths to your data. Otherwise you can use pkg_resources to do this: http://peak.telecommunity.com/DevCenter/PythonEggs#accessing-package-resources
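As a rough illustration of reading a data file through the package rather than through a file system path, here is a sketch that uses pkgutil.get_data from the standard library. Note that this swaps in pkgutil for pkg_resources so the example runs without extra dependencies; with pkg_resources the comparable call would be pkg_resources.resource_string('package_fiddler', 'data/example.html'). The package name fiddler_demo and its contents are hypothetical and are created in a temporary directory only for the demonstration.

```python
import os
import pkgutil
import sys
import tempfile

with tempfile.TemporaryDirectory() as root:
    # Build a throwaway package: fiddler_demo/__init__.py and fiddler_demo/data/example.html
    package_dir = os.path.join(root, "fiddler_demo")
    os.makedirs(os.path.join(package_dir, "data"))
    open(os.path.join(package_dir, "__init__.py"), "w").close()
    with open(os.path.join(package_dir, "data", "example.html"), "w") as handle:
        handle.write("<p>hello</p>")

    sys.path.insert(0, root)  # make the demo package importable
    try:
        # The resource path is relative to the package and always uses "/" separators.
        html = pkgutil.get_data("fiddler_demo", "data/example.html")
    finally:
        sys.path.remove(root)
        sys.modules.pop("fiddler_demo", None)

print(html)  # the file contents as bytes
```

In an installed project the temporary-directory setup would not be needed; the call would simply name the real package and the data file's path inside it.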
You definitely don't need an __init__.py file in the "data" directory - this directory is not a module and is not meant to be imported.