Python 2 and Python 3 dual development

Solution 1:

In my experience, it depends on the kind of project.

If it is a library or very self contained application, a common choice is develop in Python 2.7 avoiding constructs deprecated in Python 3.x as much as possible and resort to automated tests to identify holes left by py2to3 that you will have to fix manually.

On the other side, for real life applications, be prepared to constantly stumble upon libraries that are not ported to py3k yet (sometimes important ones). Most of the time you will have no choice but port the library to Python 3, so if you can afford that, go for it. Usually I can't, that is why I'm not supporting Python 3 for this kind of project (but I struggle to write code that will be easier to port when opportune).

For unicode handling, I found this PyCon 2012 video very informative. The advice is good for both Python 2.x and 3.x: treat every string coming from outside as bytes and convert to unicode as soon as possible and output strings converting to bytes as late as possible. There is another very informative video about date/time handling.

[update]

This is an old answer. As of today (2019) there is no good rationale to start a project using Python 2.x and there are several compelling reasons to port older projects to Python 3.7+ and abandon support for Python 2.x.

Solution 2:

In my experience it is better to not to use a library like six; I instead have a single compat.py for each package with just the needed code, not unlike Scott Griffiths's approach. six has also the burden of trying to support long-gone Python versions: the truth is that life is much easier when you accept that Pythons <=2.6 and <=3.2 are gone. In 2.7 there are backported compatibility features such as .view* methods on dicts that work exactly like their non-prefixed versions on Python 3; and Python 3.3 on the other hand supports u prefix on unicode strings again.

Even for very substantial packages, a compat.py module, which allows other code to work unchanged, can be quite short: here's an example from the pika package that my colleagues and I helped to make 2/3 polyglot. Pika is one of those projects that had really messed internals mixing unicode and 8 bit strings with each other, but now we've used it in production on Python 3 for well over 6 months without problems.


Other important thing is to always use the following __future__s when developing:

from __future__ import absolute_import, division, print_function

I recommend against using unicode_literals, because there are some strings that need to be of the type called str on either platform. If you don't use unicode_literals, you can do the following:

  • b'123' is the 8-bit string literal
  • '123' is of the type str on both platforms
  • u'123' is proper unicode text on both platforms

In any case, please do not do 2to3 on installation/package build time; some packages used to do that in the past - pip installing those packages took some seconds on Python 2, but closer to minute on Python 3.

Solution 3:

Pick 2 or 3, whichever is your favorite flavor, and make it work really well in that, with unit tests. Then make sure those tests work after running it through py2to3 or py3to2. Better to maintain one version of code.

Solution 4:

My personal experience has been that it's easier to write code that works unchanged in both Python 2 and 3, rather than rely on 2to3/3to2 scripts which often can't quite get the translation right.

Maybe my situation is unusual as I'm doing lots with byte types and 2to3 has a hard task converting these, but the convenience of having one code base outweighs the nastiness of having a few hacks in the code.

As a concrete example, my bitstring module was an early convert to Python 3 and the same code is used for Python 2.6/2.7/3.x. The source is over 4000 lines of code and this is this bit I needed to get it to work for the different major versions:

# For Python 2.x/ 3.x coexistence
# Yes this is very very hacky.
try:
    xrange
    for i in range(256):
        BYTE_REVERSAL_DICT[i] = chr(int("{0:08b}".format(i)[::-1], 2))
except NameError:
    for i in range(256):
        BYTE_REVERSAL_DICT[i] = bytes([int("{0:08b}".format(i)[::-1], 2)])
    from io import IOBase as file
    xrange = range
    basestring = str

OK, that's not pretty, but it means that I can write 99% of the code in good Python 2 style and all the unit tests still pass for the same code in Python 3. This route isn't for everyone, but it is an option to consider.