How does Python 2 compare string and int? Why do lists compare as greater than numbers, and tuples greater than lists?
The following snippet is annotated with the output (as seen on ideone.com):
print "100" < "2" # True
print "5" > "9" # False
print "100" < 2 # False
print 100 < "2" # True
print 5 > "9" # False
print "5" > 9 # True
print [] > float('inf') # True
print () > [] # True
Can someone explain why the output is as such?
Implementation details
- Is this behavior mandated by the language spec, or is it up to implementors?
- Are there differences between any of the major Python implementations?
- Are there differences between versions of the Python language?
From the Python 2 manual:
CPython implementation detail: Objects of different types except numbers are ordered by their type names; objects of the same types that don’t support proper comparison are ordered by their address.
When you order two strings or two numeric types, the ordering is done in the expected way (lexicographic ordering for strings, numeric ordering for numbers).
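To make the difference between lexicographic and numeric ordering concrete, here is a small sketch. It is written in Python 3 syntax, but same-type comparisons like these give the same results in Python 2. The list `values` is just made-up sample data.

```python
values = ["100", "2", "9"]  # sample data, chosen for illustration

# Lexicographic: strings are compared character by character,
# and "1" sorts before "2", so "100" < "2" as strings.
as_strings = sorted(values)

# Numeric: convert each string with int() before comparing.
as_numbers = sorted(values, key=int)

print(as_strings)                          # ['100', '2', '9']
print(as_numbers)                          # ['2', '9', '100']
print("100" < "2", int("100") < int("2"))  # True False
```

This is why the first two lines of the question's output look surprising: both operands are strings, so no numeric comparison ever takes place.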
When you order a numeric and a non-numeric type, the numeric type comes first.
>>> 5 < 'foo'
True
>>> 5 < (1, 2)
True
>>> 5 < {}
True
>>> 5 < [1, 2]
True
When you order two incompatible types where neither is numeric, they are ordered by the alphabetical order of their typenames:
>>> [1, 2] > 'foo' # 'list' < 'str'
False
>>> (1, 2) > 'foo' # 'tuple' > 'str'
True
>>> class Foo(object): pass
>>> class Bar(object): pass
>>> Bar() < Foo()
True
One exception is old-style classes, which always come before new-style classes.
>>> class Foo: pass # old-style
>>> class Bar(object): pass # new-style
>>> Bar() < Foo()
False
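The rules above can be approximated with a short sketch. This is not CPython's actual code, just a simplified emulation written in Python 3 syntax: the names `py2_type_rank` and `py2_style_less_than` are hypothetical, and special cases such as `None` and old-style classes are deliberately ignored.

```python
import numbers


def py2_type_rank(obj):
    """Approximate where CPython 2 places obj among unrelated types.

    Hypothetical helper for illustration: numbers get rank 0 so they
    sort first; everything else gets rank 1 plus its type name.
    """
    if isinstance(obj, numbers.Number):
        return (0, "")
    return (1, type(obj).__name__)


def py2_style_less_than(a, b):
    """Return a < b under the simplified rules of this sketch."""
    rank_a, rank_b = py2_type_rank(a), py2_type_rank(b)
    if rank_a != rank_b:
        # Different kinds of object: the rank decides, the values are ignored.
        return rank_a < rank_b
    # Same kind: fall back to the type's own comparison.
    return a < b


print(py2_style_less_than(100, "2"))           # True
print(py2_style_less_than("100", 2))           # False
print(py2_style_less_than(float("inf"), []))   # True
print(py2_style_less_than([], ()))             # True
```

The results match the question's output: any number sorts before any list, and `'list'` sorts before `'tuple'` alphabetically, so `[] > float('inf')` and `() > []` both come out true in Python 2.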
Is this behavior mandated by the language spec, or is it up to implementors?
Python has no formal language specification. The closest thing is the language reference, which says:
Otherwise, objects of different types always compare unequal, and are ordered consistently but arbitrarily.
So it is an implementation detail.
Are there differences between any of the major Python implementations?
I can't answer this one because I have only used the official CPython implementation, but there are other implementations of Python such as PyPy.
Are there differences between versions of the Python language?
In Python 3.x the behaviour has been changed so that attempting to order an integer and a string will raise an error:
>>> '10' > 5
Traceback (most recent call last):
  File "<pyshell#0>", line 1, in <module>
    '10' > 5
TypeError: unorderable types: str() > int()
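In Python 3 code, the usual way around this is to convert explicitly before comparing. A minimal sketch follows; `safe_greater_than` is a hypothetical helper, and it assumes the string holds a valid integer literal. Note that the exact wording of the `TypeError` message differs between Python 3 releases, so the sketch only checks the exception type.

```python
def safe_greater_than(text, number):
    """Compare a numeric string with an int by converting first.

    Hypothetical helper; assumes text holds a valid integer literal.
    """
    return int(text) > number


error_name = None
try:
    result = '10' > 5          # ordering str against int raises in Python 3
except TypeError as exc:
    result = None
    error_name = type(exc).__name__

print(result)                        # None
print(error_name)                    # TypeError
print(safe_greater_than('10', 5))    # True
```

Making the conversion explicit forces you to decide whether you want numeric or lexicographic ordering, which is exactly the choice Python 2 made silently.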
Strings are compared lexicographically, and dissimilar types are compared by the name of their type ("int" < "str"). Python 3.x fixes the second point by making them non-comparable.