Why does defining __getitem__ on a class make it iterable in python?

Why does defining __getitem__ on a class make it iterable?

For instance if I write:

class b:
  def __getitem__(self, k):
    return k

cb = b()

for k in cb:
  print k

I get the output:

0
1
2
3
4
5
6
7
8
...

I would really expect to see an error returned from "for k in cb:"


Iteration's support for __getitem__ can be seen as a "legacy feature" which allowed smoother transition when PEP234 introduced iterability as a primary concept. It only applies to classes without __iter__ whose __getitem__ accepts integers 0, 1, &c, and raises IndexError once the index gets too high (if ever), typically "sequence" classes coded before __iter__ appeared (though nothing stops you from coding new classes this way too).

Personally, I would rather not rely on this in new code, though it's not deprecated nor is it going away (works fine in Python 3 too), so this is just a matter of style and taste ("explicit is better than implicit" so I'd rather explicitly support iterability rather than rely on __getitem__ supporting it implicitly for me -- but, not a bigge).


If you take a look at PEP234 defining iterators, it says:

1. An object can be iterated over with "for" if it implements
   __iter__() or __getitem__().

2. An object can function as an iterator if it implements next().

__getitem__ predates the iterator protocol, and was in the past the only way to make things iterable. As such, it's still supported as a method of iterating. Essentially, the protocol for iteration is:

  1. Check for an __iter__ method. If it exists, use the new iteration protocol.

  2. Otherwise, try calling __getitem__ with successively larger integer values until it raises IndexError.

(2) used to be the only way of doing this, but had the disadvantage that it assumed more than was needed to support just iteration. To support iteration, you had to support random access, which was much more expensive for things like files or network streams where going forwards was easy, but going backwards would require storing everything. __iter__ allowed iteration without random access, but since random access usually allows iteration anyway, and because breaking backward compatability would be bad, __getitem__ is still supported.


Special methods such as __getitem__ add special behaviors to objects, including iteration.

http://docs.python.org/reference/datamodel.html#object.getitem

"for loops expect that an IndexError will be raised for illegal indexes to allow proper detection of the end of the sequence."

Raise IndexError to signal the end of the sequence.

Your code is basically equivalent to:

i = 0
while True:
    try:
        yield object[i]
        i += 1
    except IndexError:
        break

Where object is what you're iterating over in the for loop.


This is so for historical reasons. Prior to Python 2.2 __getitem__ was the only way to create a class that could be iterated over with the for loop. In 2.2 the __iter__ protocol was added but to retain backwards compatibility __getitem__ still works in for loops.