Drag forward installed Python packages when upgrading

As somebody who runs multiple webservers each running a set of Django sites, keeping on top of my Python stack is very important. Out of (probably bad) habit, I rely on Ubuntu for a number of my Python packages, including python-django and a lot of python-django-* extras. The websites require these to run but as long as the package still exists, this isn't an issue. I do this rather than using VirtualEnv (et al) because I want Ubuntu to install security updates.

However the Ubuntu repos don't cater for everybody. There are cases where I'll use pip or easy_install to suck in the latest version of a Python package. When you update Python (as occasionally happens in Ubuntu), you lose all your pip-installed packages.

What terrifies me is that the deeper I get and the more servers I administer, the more likely it is that one day an OS update will cost me hours and hours of running around, testing sites and reinstalling Python packages through pip. The worst part is the potential downtime for client sites, though I do test on my development machine (always at Ubuntu-latest), which should offset some of that worry.

Is there anything I can do to make sure updates to Python mean the existing, non-dpkg'd Python packages are brought forward?

That would make sure I always had access to the same packages. I'd still have to test for incompatibilities but it would be a good start.

There's perhaps one better solution: an application that behaves like apt and dpkg but for interacting with PyPI (where pip and easy_install get most of their mojo). Something that stores a local list of installed packages, checks for updates like apt, manages installing, etc. Does such a thing exist? Or is it a rubbish idea?

Regarding how to keep your Python packages when the system Python is upgraded, I see two options:

  1. You can install non-Ubuntu Python stuff with easy_install --install-dir /usr/local/python. Then make sure that all your webapps have that directory on sys.path, for instance by adding it to PYTHONPATH, or by using a directory that is automatically included by site.py (whose doc states that "Local addons go into /usr/local/lib/python<version>/dist-packages").

  2. You can use virtualenvs, provided you can place all your app data and configuration in a directory independent of the code. Here's a sketch procedure:

    a. Place all the code-independent stuff into directory myapp-data/

    b. Create virtualenv myapp-code.XXX/ (where XXX is some unique version number, e.g., date -I)

    c. Place app code and all dependency packages in myapp-code.XXX

    d. ln -s myapp-code.XXX myapp-code

    When you have to upgrade, you just repeat steps b. and c. with a different revision code YYY, then: stop currently running app, symlink myapp-code to myapp-code.YYY, start app from virtualenv myapp-code.YYY. If something goes wrong, you can still roll back to the old virtualenv quickly.
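For option 1, here is a minimal sketch of how a webapp's startup code (a WSGI file, say) could pull the extra directory in without relying on PYTHONPATH. The function name add_local_packages is my own invention; site.addsitedir is the standard library call, and /usr/local/python is simply the directory from the easy_install example above.

```python
import os
import site
import sys

# Directory used with "easy_install --install-dir" in option 1.
LOCAL_PACKAGES = "/usr/local/python"


def add_local_packages(directory=LOCAL_PACKAGES):
    """Put `directory` on sys.path and process any .pth files inside it.

    Returns True if the directory ended up on sys.path.
    """
    directory = os.path.abspath(directory)
    # addsitedir appends the directory to sys.path (if not already there)
    # and then handles .pth files, which easy_install relies on for eggs.
    site.addsitedir(directory)
    return directory in sys.path
```

Call add_local_packages() before importing anything that lives in that directory. Unlike a plain sys.path.append, this also honours .pth files, which matters for egg-style installs.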

Admittedly, option 2 is more work (though pip plus some shell-scripting will take you a long way towards automating it), but it should also be more robust, and it lets you concurrently run applications that depend on different versions of the same Python package.
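The symlink switch in option 2 (step d, and the later move to YYY) could be scripted roughly as follows. This is a sketch in Python rather than shell, it assumes the new virtualenv directory has already been created and populated (steps b and c), and switch_release is a name I made up for illustration.

```python
import os


def switch_release(base_dir, link_name, release_name):
    """Repoint base_dir/link_name at base_dir/release_name.

    Returns the previous link target (or None if there was no link),
    so the caller can roll back by calling this again with that value.
    """
    link_path = os.path.join(base_dir, link_name)
    previous = os.readlink(link_path) if os.path.islink(link_path) else None

    # Build the new link under a temporary name, then rename it over the
    # old one. On POSIX the rename is atomic, so there is never a moment
    # where the link is missing.
    tmp_link = link_path + ".tmp"
    if os.path.lexists(tmp_link):
        os.remove(tmp_link)
    os.symlink(release_name, tmp_link)  # relative target, like "ln -s"
    os.replace(tmp_link, link_path)
    return previous
```

You would stop the app, call switch_release(".", "myapp-code", "myapp-code.YYY"), and start the app again; keeping the returned value around gives you the quick rollback. Note that this only works if myapp-code is a symlink (or absent), not a real directory.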

Regarding your question about an apt-get-like tool for Python packages: pip deliberately does not work that way, and for a good reason: package APIs and behavior may change across different versions. Therefore, if your code runs fine against version X, it may fail when run with version X+1. This is exactly what pip tries to prevent with its "freeze" and "requirements list" features.
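To make the "freeze" idea concrete, here is a small sketch that produces the same kind of pinned list from inside Python, using the standard importlib.metadata module. It is not how pip itself is implemented, and frozen_requirements is a name I picked for illustration; in practice "pip freeze > requirements.txt" does this job for you.

```python
from importlib import metadata


def frozen_requirements():
    """Return sorted 'name==version' lines for the installed distributions."""
    pins = set()
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        version = dist.metadata.get("Version")
        if name and version:  # skip entries with broken metadata
            pins.add(f"{name}=={version}")
    return sorted(pins, key=str.lower)


if __name__ == "__main__":
    # Print the pinned list; redirect to a file to keep it around.
    print("\n".join(frozen_requirements()))
```

Saving that output before an upgrade and feeding it to "pip install -r requirements.txt" afterwards brings the same versions forward. Be aware that the list covers everything Python can see, including packages that came from Ubuntu, so you would probably want to trim it down to the ones you installed yourself.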

Of course, the same argument can be applied to any program in a binary distribution like Ubuntu; indeed, what makes apt-get useful is that Debian and Ubuntu provide a coordinated release of interoperable packages: a lot of effort from the maintainers goes into ensuring that all Ubuntu packages in the main repositories are compatible.

There is just no such coordinated release of Python packages: each package is independent, and no information is available about which versions of other Python packages are compatible with it. (This could possibly be a good addition to PyPI metadata.)

I would use a configuration management system like Puppet or Chef to run scripts on all servers and keep them in sync. Your scripts (recipes) can manage the upgrade, including automatically testing whether it succeeded or broke something.