Version number comparison in Python
I want to write a cmp
-like function that compares two version numbers and returns -1
, 0
, or 1
based on their compared values.
- Return
-1
if version A is older than version B - Return
0
if versions A and B are equivalent - Return
1
if version A is newer than version B
Each subsection is supposed to be interpreted as a number, therefore 1.10 > 1.1.
Desired function outputs are
mycmp('1.0', '1') == 0
mycmp('1.0.0', '1') == 0
mycmp('1', '1.0.0.1') == -1
mycmp('12.10', '11.0.0.0.0') == 1
...
And here is my implementation, open for improvement:
def mycmp(version1, version2):
parts1 = [int(x) for x in version1.split('.')]
parts2 = [int(x) for x in version2.split('.')]
# fill up the shorter version with zeros ...
lendiff = len(parts1) - len(parts2)
if lendiff > 0:
parts2.extend([0] * lendiff)
elif lendiff < 0:
parts1.extend([0] * (-lendiff))
for i, p in enumerate(parts1):
ret = cmp(p, parts2[i])
if ret: return ret
return 0
I'm using Python 2.4.5 btw. (installed at my working place ...).
Here's a small 'test suite' you can use
assert mycmp('1', '2') == -1
assert mycmp('2', '1') == 1
assert mycmp('1', '1') == 0
assert mycmp('1.0', '1') == 0
assert mycmp('1', '1.000') == 0
assert mycmp('12.01', '12.1') == 0
assert mycmp('13.0.1', '13.00.02') == -1
assert mycmp('1.1.1.1', '1.1.1.1') == 0
assert mycmp('1.1.1.2', '1.1.1.1') == 1
assert mycmp('1.1.3', '1.1.3.000') == 0
assert mycmp('3.1.1.0', '3.1.2.10') == -1
assert mycmp('1.1', '1.10') == -1
Solution 1:
How about using Python's distutils.version.StrictVersion
?
>>> from distutils.version import StrictVersion
>>> StrictVersion('10.4.10') > StrictVersion('10.4.9')
True
So for your cmp
function:
>>> cmp = lambda x, y: StrictVersion(x).__cmp__(y)
>>> cmp("10.4.10", "10.4.11")
-1
If you want to compare version numbers that are more complex distutils.version.LooseVersion
will be more useful, however be sure to only compare the same types.
>>> from distutils.version import LooseVersion, StrictVersion
>>> LooseVersion('1.4c3') > LooseVersion('1.3')
True
>>> LooseVersion('1.4c3') > StrictVersion('1.3') # different types
False
LooseVersion
isn't the most intelligent tool, and can easily be tricked:
>>> LooseVersion('1.4') > LooseVersion('1.4-rc1')
False
To have success with this breed, you'll need to step outside the standard library and use setuptools's parsing utility parse_version
.
>>> from pkg_resources import parse_version
>>> parse_version('1.4') > parse_version('1.4-rc2')
True
So depending on your specific use-case, you'll need to decide whether the builtin distutils
tools are enough, or if it's warranted to add as a dependency setuptools
.
Solution 2:
Remove the uninteresting part of the string (trailing zeroes and dots), and then compare the lists of numbers.
import re
def mycmp(version1, version2):
def normalize(v):
return [int(x) for x in re.sub(r'(\.0+)*$','', v).split(".")]
return cmp(normalize(version1), normalize(version2))
This is the same approach as Pär Wieslander, but a bit more compact:
Here are some tests, thanks to "How to compare two strings in dot separated version format in Bash?":
assert mycmp("1", "1") == 0
assert mycmp("2.1", "2.2") < 0
assert mycmp("3.0.4.10", "3.0.4.2") > 0
assert mycmp("4.08", "4.08.01") < 0
assert mycmp("3.2.1.9.8144", "3.2") > 0
assert mycmp("3.2", "3.2.1.9.8144") < 0
assert mycmp("1.2", "2.1") < 0
assert mycmp("2.1", "1.2") > 0
assert mycmp("5.6.7", "5.6.7") == 0
assert mycmp("1.01.1", "1.1.1") == 0
assert mycmp("1.1.1", "1.01.1") == 0
assert mycmp("1", "1.0") == 0
assert mycmp("1.0", "1") == 0
assert mycmp("1.0", "1.0.1") < 0
assert mycmp("1.0.1", "1.0") > 0
assert mycmp("1.0.2.0", "1.0.2") == 0
Solution 3:
Is reuse considered elegance in this instance? :)
# pkg_resources is in setuptools
# See http://peak.telecommunity.com/DevCenter/PkgResources#parsing-utilities
def mycmp(a, b):
from pkg_resources import parse_version as V
return cmp(V(a),V(b))
Solution 4:
No need to iterate over the version tuples. The built in comparison operator on lists and tuples already works exactly like you want it. You'll just need to zero extend the version lists to the corresponding length. With python 2.6 you can use izip_longest to pad the sequences.
from itertools import izip_longest
def version_cmp(v1, v2):
parts1, parts2 = [map(int, v.split('.')) for v in [v1, v2]]
parts1, parts2 = zip(*izip_longest(parts1, parts2, fillvalue=0))
return cmp(parts1, parts2)
With lower versions, some map hackery is required.
def version_cmp(v1, v2):
parts1, parts2 = [map(int, v.split('.')) for v in [v1, v2]]
parts1, parts2 = zip(*map(lambda p1,p2: (p1 or 0, p2 or 0), parts1, parts2))
return cmp(parts1, parts2)
Solution 5:
This is a little more compact than your suggestion. Rather than filling the shorter version with zeros, I'm removing trailing zeros from the version lists after splitting.
def normalize_version(v):
parts = [int(x) for x in v.split(".")]
while parts[-1] == 0:
parts.pop()
return parts
def mycmp(v1, v2):
return cmp(normalize_version(v1), normalize_version(v2))