How do I manage third-party Python libraries with Google App Engine? (virtualenv? pip?)
What's the best strategy for managing third-party Python libraries with Google App Engine?
Say I want to use Flask, a webapp framework. A blog entry says to do this, which doesn't seem right:
$ cd /tmp/
$ wget http://pypi.python.org/packages/source/F/Flask/Flask-0.6.1.tar.gz
$ tar zxf Flask-0.6.1.tar.gz
$ cp -r Flask-0.6.1/flask ~/path/to/project/
(... repeat for other packages ...)
There must be a better way to manage third-party code, especially if I want to track versions, test upgrades or if two libraries share a subdirectory. I know that Python can import modules from zipfiles and that pip can work with a wonderful REQUIREMENTS file, and I've seen that pip has a zip command for use with GAE.
(Note: There's a handful of similar questions — 1, 2, 3, 4, 5 — but they're case-specific and don't really answer my question.)
Solution 1:
Here's how I do it:
- project
- .Python
- bin
- lib
- python2.5
- site-packages
- < pip install packages here >
- site-packages
- python2.5
- include
- src
- app.yaml
- index.yaml
- main.py
- < symlink the pip installed packages in ../lib/python2.5/site-packages >
The project directory is the top-level directory where the virtualenv sits. I create the virtualenv using the following commands:
cd project
virtualenv -p /usr/bin/python2.5 --no-site-packages --distribute .
The src directory is where all your code goes. When you deploy your code to GAE, deploy *only* the contents of the src directory and nothing else. appcfg.py will resolve the symlinks and copy the library files to GAE for you.
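The symlinking step described above can be scripted. What follows is only a minimal sketch, not part of the original workflow: the helper name link_packages and its arguments are hypothetical, and it assumes a Python recent enough to have os.path.relpath (2.6 or later) on a platform that supports symlinks.

```python
import os


def link_packages(site_packages, src_dir, names):
    """Symlink the named packages from site_packages into src_dir.

    Returns the list of link paths that were created.
    (Hypothetical helper, for illustration only.)
    """
    created = []
    for name in names:
        target = os.path.join(site_packages, name)
        link = os.path.join(src_dir, name)
        if not os.path.exists(target):
            raise IOError("package not found: %s" % target)
        if os.path.lexists(link):
            # Already linked, or a real file is in the way: leave it alone.
            continue
        # A relative target keeps the link valid if the project is moved.
        os.symlink(os.path.relpath(target, src_dir), link)
        created.append(link)
    return created
```

Run from the project directory, a call such as link_packages('lib/python2.5/site-packages', 'src', ['flask']) would create src/flask pointing at ../lib/python2.5/site-packages/flask.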
I don't install my libraries as zip files, mainly for convenience: I often need to read the source code, which I happen to do a lot just out of curiosity. However, if you really want to zip the libraries, put the following code snippet into your main.py:
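import sys
# Prepend each archive so its modules are found before other entries on sys.path.
for p in ['librarie.zip', 'package.egg']:  # ...list any further archives here
    sys.path.insert(0, p)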
After this you can import your zipped up packages as usual.
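To see the zip import mechanism work outside of GAE, here is a small self-contained sketch. The archive name libraries.zip and the module zipped_demo are made up for the demonstration; the only part taken from the answer is the sys.path.insert call.

```python
import os
import sys
import tempfile
import zipfile

# Build a throwaway archive containing one tiny module.
workdir = tempfile.mkdtemp()
archive = os.path.join(workdir, 'libraries.zip')
with zipfile.ZipFile(archive, 'w') as zf:
    zf.writestr('zipped_demo.py', 'GREETING = "hello from the zip"\n')

# Same move as in main.py: put the archive itself on the import path.
sys.path.insert(0, archive)

import zipped_demo  # resolved by Python's built-in zipimport support

print(zipped_demo.GREETING)
print(zipped_demo.__file__)  # a path inside libraries.zip
```

Note that only pure-Python modules can be imported from an archive this way; compiled extension modules cannot be loaded from a zip file.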
One thing to watch out for is setuptools' pkg_resources.py. I copied that directly into my src directory so my other symlinked packages can use it. Also watch out for anything that uses entry_points. In my case I'm using Toscawidgets2 and I had to dig into the source code to manually wire up the pieces. It can become annoying if you have a lot of libraries that rely on entry_points.
Solution 2:
What about simply:
$ pip install -r requirements.txt -t <your_app_directory/lib>
Create/edit <your_app_directory>/appengine_config.py:
"""This file is loaded when starting a new application instance."""
import sys
import os.path
# add `lib` subdirectory to `sys.path`, so our `main` module can load
# third-party libraries.
sys.path.insert(0, os.path.join(os.path.dirname(__file__), 'lib'))
UPDATE:
Google updated their sample to use appengine_config.py, like this:
from google.appengine.ext import vendor
vendor.add('lib')
Note: Even though their example has .gitignore ignoring the lib/ directory, you still need to keep that directory under source control if you use the git-push deployment method.
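For intuition about the two appengine_config.py variants above, here is a rough standard-library approximation of a "vendor directory" helper. It is a sketch under stated assumptions, not Google's implementation of vendor.add: the function name add_vendor_dir and its index parameter are hypothetical. The one practical difference it illustrates is that site.addsitedir also processes any .pth files in the directory, which a plain sys.path.insert does not.

```python
import os
import site
import sys


def add_vendor_dir(path, index=0):
    """Register `path` as a site directory and move it to position `index`.

    Hypothetical helper for illustration; not the App Engine SDK's code.
    """
    if not os.path.isdir(path):
        raise ValueError("not a directory: %r" % path)
    before = list(sys.path)
    # addsitedir appends the directory and handles any .pth files in it.
    site.addsitedir(path)
    added = [p for p in sys.path if p not in before]
    remaining = [p for p in sys.path if p in before]
    # Re-insert the new entries early so vendored libraries take precedence.
    sys.path[:] = remaining[:index] + added + remaining[index:]
```

In an appengine_config.py this would be called as add_vendor_dir(os.path.join(os.path.dirname(__file__), 'lib')), mirroring the sys.path.insert version above.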