How are you planning on handling the migration to Python 3?

I'm sure this is a subject that's on most Python developers' minds, considering that Python 3 is coming out soon. Some questions to get us going in the right direction:

  1. Will you maintain Python 2 and Python 3 versions concurrently, or will you simply have a Python 3 version once it's finished?

  2. Have you already started, or do you plan on starting soon? Or do you plan on waiting until the final version comes out to get into full swing?

Here's the general plan for Twisted. I was originally going to blog this, but then I thought: why blog about it when I could get points for it?

  1. Wait until somebody cares.

    Right now, nobody has Python 3. We're not going to spend a bunch of effort until at least one actual user has come forth and said "I need Python 3.0 support", and has a good reason for it aside from the fact that 3.0 looks shiny.

  2. Wait until our dependencies have migrated.

    A large system like Twisted has a number of dependencies. For starters, ours include:

    • Zope Interface
    • PyCrypto
    • PyOpenSSL
    • pywin32
    • PyGTK (though this dependency is sadly very light right now; by the time migration rolls around, I hope Twisted will have more GUI tools)
    • pyasn1
    • PyPAM
    • gmpy

    Some of these projects have their own array of dependencies so we'll have to wait for those as well.

  3. Wait until somebody cares enough to help.

    There are, charitably, 5 people who work on Twisted - and I say "charitably" because that's counting me, and I haven't committed in months. We have over 1000 open tickets right now, and it would be nice to actually fix some of those — fix bugs, add features, and generally make Twisted a better product in its own right — before spending time on getting it ported over to a substantially new version of the language.

    This potentially includes sponsors caring enough to pay for us to do it, but I hope that there will be an influx of volunteers who care about 3.0 support and want to help move the community forward.

  4. Follow Guido's advice.

    This means we will not change our API incompatibly, and we will follow the transitional development guidelines that Guido posted last year. That starts with having unit tests, and running the 2to3 conversion tool over the Twisted codebase.

  5. Report bugs against, and file patches for, the 2to3 tool.

    When we get to the point where we're actually using it, I anticipate that there will be a lot of problems with running 2to3. Running it over Twisted right now takes an extremely long time and (last I checked, which was quite a while ago) it can't parse a few of the files in the Twisted repository, so the resulting output won't import. I think there will have to be a fair number of success stories from small projects and a lot of hammering on the tool before it will actually work for us.

    However, the Python development team has been very helpful in responding to our bug reports, and early responses to these problems have been encouraging, so I expect that all of these issues will be fixed in time.

  6. Maintain 2.x compatibility for several years.

    Right now, Twisted supports Python 2.3 to 2.5. Currently, we're working on 2.6 support (which we'll obviously have to finish before 3.0!). Our plan is to revise our supported versions of Python based on the long-term supported versions of Ubuntu: release 8.04, which includes Python 2.5, will be supported until 2013. According to Guido's advice we will need to drop support for 2.5 in order to support 3.0, but I am hoping we can find a way around that (we are pretty creative with version-compatibility hacks).

    So, we are planning to support Python 2.5 until at least 2013. In two years, Ubuntu will release another long-term supported version: if they still exist, and stay on schedule, that will be 10.04. Personally I am guessing that this will ship with Python 2.x, perhaps Python 2.8, as /usr/bin/python, because there is a huge amount of Python software packaged with the distribution and it will take a long time to update it all. So, five years from then, in 2015, we can start looking at dropping 2.x support.

    During this period, we will continue to follow Guido's advice about migration: running 2to3 over our 2.x codebase, and modifying the 2.x codebase to keep its tests passing in both versions.

    The upshot of this is that Python 3.x will not be a source language for Twisted until well after my 35th birthday — it will be a target runtime (and a set of guidelines and restrictions) for my python 2.x code. I expect to be writing programs in Python 2.x for the next ten years or so.
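
As an illustration of "keeping tests passing in both versions", here is a minimal sketch of a test written in the subset of the language that 2.x and 3.x share. The frame() helper and FrameTests class are hypothetical and are not taken from Twisted; the sketch also assumes Python 2.6 or later on the 2.x side, since it relies on bytes literals.

```python
import unittest


def frame(payload):
    """Prefix a bytes payload with its length as a 4-digit ASCII header.

    Hypothetical helper, used only to give the tests something to pin down.
    """
    if not isinstance(payload, bytes):
        raise TypeError("payload must be bytes")
    return ("%04d" % len(payload)).encode("ascii") + payload


class FrameTests(unittest.TestCase):
    def test_header(self):
        # b"..." literals mean the same thing on 2.6+ and 3.x.
        self.assertEqual(frame(b"ping"), b"0004ping")

    def test_rejects_text(self):
        # u"..." is unicode on 2.x and str on 3.x; neither is bytes.
        self.assertRaises(TypeError, frame, u"ping")
```

Tests like these pin down the bytes/text boundary, which is one of the places where behavior is most likely to differ between the two interpreters.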

So, that's the plan. I'm hoping that it ends up looking laughably conservative in a year or so; that the 3.x transition is easy as pie, and everyone rapidly upgrades. Other things could happen, too: the 2.x and 3.x branches could converge, someone might end up writing a 3to2, or another runtime (PyPy comes to mind) might allow for running 2.x and 3.x code in the same process directly, making our conversion process easier.

For the time being, however, we're assuming that, for many years, we will have people with large codebases they're maintaining (or people writing new code who want to use other libraries which have not yet been migrated) who still want new features and bug fixes in Twisted. Pretty soon I expect we will also have bleeding-edge users who want to use Twisted on Python 3. I'd like to provide all of those people with a positive experience for as long as possible.


The Django project uses the library six to maintain a codebase that works simultaneously on Python 2 and Python 3 (blog post).

six does this by providing a compatibility layer that intelligently redirects imports and functions to their respective locations (as well as unifying other incompatible changes).

Obvious advantages:

  • No need for separate branches for Python 2 and Python 3
  • No conversion tools, such as 2to3.
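
To make the idea of a compatibility layer concrete, here is a hand-rolled sketch that uses only the standard library. It is not six's actual code. The names PY2, text_type, string_types and integer_types mirror names that six exposes, while is_string() is a hypothetical helper added for illustration.

```python
import sys

PY2 = sys.version_info[0] == 2

if PY2:
    # These names only exist on 2.x; this branch never runs on 3.x.
    text_type = unicode
    string_types = (basestring,)
    integer_types = (int, long)
else:
    text_type = str
    string_types = (str,)
    integer_types = (int,)

# A module that moved between versions gets a single import site.
try:
    from urllib.parse import urlparse   # Python 3 location
except ImportError:
    from urlparse import urlparse       # Python 2 location


def is_string(value):
    """Return True for any text-like string on either major version."""
    return isinstance(value, string_types)
```

The rest of the codebase then imports these names from the compatibility module instead of branching on the interpreter version itself.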

The main idea of 2.6 is to provide a migration path to 3.0. You can use from __future__ import X to migrate one feature at a time until you have all of them nailed down and can move to 3.0. Many of the 3.0 features will also be available in 2.6, so you can narrow the language gap gradually rather than having to migrate everything in one go.
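
A minimal sketch of what opting in looks like for a single module. The three feature names are real __future__ features available from 2.6; the average() and describe() functions are made up for illustration.

```python
from __future__ import division, print_function, unicode_literals


def average(values):
    # With `division` enabled, / is true division on 2.6+ as well as on 3.x.
    return sum(values) / len(values)


def describe(values):
    # With `unicode_literals`, this literal is text on both versions.
    return "mean={0}".format(average(values))


# With `print_function`, print is a function on 2.6+ too.
print(describe([1, 2]))
```

Because the import only affects the module it appears in, each module can be switched over separately while the rest of the codebase stays as it is.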

At work, we plan to upgrade from 2.5 to 2.6 first. Then we begin enabling 3.0 features slowly one module at a time. At some point a whole subpart of the system will probably be ready for 3.x.

The only problem is libraries. If a library is never migrated, we are stuck with the old library. But I am pretty confident that we'll get a fine alternative in due time for that part.


Speaking as a library author:

I'm waiting for the final version to be released. My belief, like that of most of the Python community, is that 2.x will continue to be the dominant version for a period of weeks or months. That's plenty of time to release a nice, polished 3.x release.

I'll be maintaining separate 2.x and 3.x branches. 2.x will be backwards compatible to 2.4, so I can't use a lot of the fancy syntax or new features in 2.6 / 3.0. In contrast, the 3.x branch will use every one of those features that results in a nicer experience for the user. The test suite will be modified so that 2to3 will work upon it, and I'll maintain the same tests for both branches.


Support both

I wanted to make an attempt at converting the BeautifulSoup library to 3x for a project I'm working on but I can see how it would be a pain to maintain two different branches of the code.

The current model for handling this is:

  1. make a change to the 2x branch
  2. run 2to3
  3. pray that it does the conversion properly the first time
  4. run the code
  5. run unit tests to verify that everything works
  6. copy the output to the 3x branch

This model works but IMHO it sucks. For every change/release you have to go through these steps ::sigh::. Plus, it discourages developers from extending the 3x branch with new features that can only be supported in py3k because you're still essentially targeting all the code to 2x.

The solution... use a preprocessor

Since I couldn't find a decent C-style preprocessor with #define and #ifdef directives for Python, I wrote one.

It's called pypreprocessor and can be found on PyPI.

Essentially, what you do is:

  1. import pypreprocessor
  2. detect which version of python the script is running in
  3. set a 'define' in the preprocessor for the version (ex 'python2' or 'python3')
  4. sprinkle '#ifdef python2' and '#ifdef python3' directives where the code is version specific
  5. run the code

That's it. Now it'll work in both 2x and 3x. If you are worried about the added performance hit of running a preprocessor, there's also a mode that will strip out all of the metadata and output the post-processed source to a file.

Best of all... you only have to do the 2to3 conversion once.

Here's a working example:

#!/usr/bin/env python
# py2and3.py

import sys
from pypreprocessor import pypreprocessor

#exclude
# sys.version_info[0] is the major version number as an int
if sys.version_info[0] == 2:
    pypreprocessor.defines.append('python2')
if sys.version_info[0] == 3:
    pypreprocessor.defines.append('python3')

pypreprocessor.parse()
#endexclude
#ifdef python2
print('You are using Python 2x')
#ifdef python3
print('You are using python 3x')
#else
print('Python version not supported')
#endif

These are the results in the terminal:

 python py2and3.py
 >>>You are using Python 2x 
 python3 py2and3.py
 >>>You are using python 3x

If you want to output to a file and make clean version-specific source file with no extra meta-data, add these two lines somewhere before the pypreprocessor.parse() statement:

pypreprocessor.output = 'outputFileName.py'
pypreprocessor.removeMeta = True

Then:

python py2and3.py

Will create a file called outputFileName.py that is python 2x specific with no extra metadata.

python3 py2and3.py

Will create a file called outputFileName.py that is python 3x specific with no extra metadata.

For documentation and more examples, check out pypreprocessor on GoogleCode.

I sincerely hope this helps. I love writing code in Python and I hope to see support progress into the 3x realm asap. I hate to see the language not progress, especially since the 3x version resolves a lot of the featured WTFs and makes the syntax look a little more friendly to users migrating from other languages.

The documentation at this point is complete but not extensive. I'll try to get the wiki up with some more extensive information soon.

Update:

Although I designed pypreprocessor specifically to solve this issue, it doesn't work, because Python syntax-checks all of the code in a file before any of it is executed. Syntax that is only valid in the other version fails before the preprocessor gets a chance to run.
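
The limitation can be reproduced without pypreprocessor at all. In this sketch, which uses only the built-in compile() function, the 2.x-only print statement sits in a branch that would never run on 3.x, yet the whole source is still rejected there. MIXED_SOURCE, PORTABLE_SOURCE and compiles() are names made up for illustration.

```python
import sys

# A 2.x-only print statement, guarded by a runtime version check.
MIXED_SOURCE = '''
import sys
if sys.version_info[0] == 2:
    print "running on Python 2"
else:
    print("running on Python 3")
'''

# The same logic written with syntax both versions accept.
PORTABLE_SOURCE = '''
import sys
if sys.version_info[0] == 2:
    print("running on Python 2")
else:
    print("running on Python 3")
'''


def compiles(source):
    """Return True if this interpreter can compile the whole source."""
    try:
        compile(source, "<example>", "exec")
        return True
    except SyntaxError:
        return False
```

On a 3.x interpreter, compiles(MIXED_SOURCE) is False even though the offending line sits in a branch that would never execute. A runtime guard, or a preprocessor that runs from inside the same file, therefore comes too late.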

If Python had real C preprocessor directive support, it would allow developers to write both python2x and python3k code alongside each other in the same file. But due to the bad reputation of the C preprocessor (abuse of macro replacement to change language keywords), I don't see legitimate C preprocessor support being added to Python any time soon.