What does x[x < 2] = 0 mean in Python?
Solution 1:
This only makes sense with NumPy arrays. The behavior with lists is useless, and specific to Python 2 (not Python 3). You may want to double-check if the original object was indeed a NumPy array (see further below) and not a list.
But in your code here, x is a simple list.
Since
x < 2
is False i.e 0, therefore
x[x<2]
is x[0]
x[0]
gets changed.
Conversely, x[x>2]
is x[True]
or x[1]
So, x[1]
gets changed.
Why does this happen?
The rules for comparison are:
When you order two strings or two numeric types the ordering is done in the expected way (lexicographic ordering for string, numeric ordering for integers).
When you order a numeric and a non-numeric type, the numeric type comes first.
When you order two incompatible types where neither is numeric, they are ordered by the alphabetical order of their typenames:
So, we have the following order
numeric < list < string < tuple
See the accepted answer for How does Python compare string and int?.
If x is a NumPy array, then the syntax makes more sense because of boolean array indexing. In that case, x < 2
isn't a boolean at all; it's an array of booleans representing whether each element of x
was less than 2. x[x < 2] = 0
then selects the elements of x
that were less than 2 and sets those cells to 0. See Indexing.
>>> x = np.array([1., -1., -2., 3])
>>> x < 0
array([False, True, True, False], dtype=bool)
>>> x[x < 0] += 20 # All elements < 0 get increased by 20
>>> x
array([ 1., 19., 18., 3.]) # Only elements < 0 are affected
Solution 2:
>>> x = [1,2,3,4,5]
>>> x<2
False
>>> x[False]
1
>>> x[True]
2
The bool is simply converted to an integer. The index is either 0 or 1.
Solution 3:
The original code in your question works only in Python 2. If x
is a list
in Python 2, the comparison x < y
is False
if y
is an int
eger. This is because it does not make sense to compare a list with an integer. However in Python 2, if the operands are not comparable, the comparison is based in CPython on the alphabetical ordering of the names of the types; additionally all numbers come first in mixed-type comparisons. This is not even spelled out in the documentation of CPython 2, and different Python 2 implementations could give different results. That is [1, 2, 3, 4, 5] < 2
evaluates to False
because 2
is a number and thus "smaller" than a list
in CPython. This mixed comparison was eventually deemed to be too obscure a feature, and was removed in Python 3.0.
Now, the result of <
is a bool
; and bool
is a subclass of int
:
>>> isinstance(False, int)
True
>>> isinstance(True, int)
True
>>> False == 0
True
>>> True == 1
True
>>> False + 5
5
>>> True + 5
6
So basically you're taking the element 0 or 1 depending on whether the comparison is true or false.
If you try the code above in Python 3, you will get TypeError: unorderable types: list() < int()
due to a change in Python 3.0:
Ordering Comparisons
Python 3.0 has simplified the rules for ordering comparisons:
The ordering comparison operators (
<
,<=
,>=
,>
) raise aTypeError
exception when the operands don’t have a meaningful natural ordering. Thus, expressions like1 < ''
,0 > None
orlen <= len
are no longer valid, and e.g.None < None
raisesTypeError
instead of returningFalse
. A corollary is that sorting a heterogeneous list no longer makes sense – all the elements must be comparable to each other. Note that this does not apply to the==
and!=
operators: objects of different incomparable types always compare unequal to each other.
There are many datatypes that overload the comparison operators to do something different (dataframes from pandas, numpy's arrays). If the code that you were using did something else, it was because x
was not a list
, but an instance of some other class with operator <
overridden to return a value that is not a bool
; and this value was then handled specially by x[]
(aka __getitem__
/__setitem__
)