Iterating over dictionary items(), values(), keys() in Python 3

If I understand correctly, in Python 2, iter(d.keys()) was the same as d.iterkeys(). But now, d.keys() is a view, which is in between the list and the iterator. What's the difference between a view and an iterator?

In other words, in Python 3, what's the difference between

for k in d.keys()
    f(k)

and

for k in iter(d.keys())
    f(k)

Also, how do these differences show up in a simple for loop (if at all)?


Solution 1:

I'm not sure if this is quite an answer to your questions but hopefully it explains a bit about the difference between Python 2 and 3 in this regard.

In Python 2, iter(d.keys()) and d.iterkeys() are not quite equivalent, although they will behave the same. In the first, keys() will return a copy of the dictionary's list of keys and iter will then return an iterator object over this list, with the second a copy of the full list of keys is never built.

The view objects returned by d.keys() in Python 3 are iterable (i.e. an iterator can be made from them) so when you say for k in d.keys() Python will create the iterator for you. Therefore your two examples will behave the same.

The significance in the change of the return type for keys() is that the Python 3 view object is dynamic. i.e. if we say ks = d.keys() and later add to d then ks will reflect this. In Python 2, keys() returns a list of all the keys currently in the dict. Compare:

Python 3

>>> d = { "first" : 1, "second" : 2 }
>>> ks = d.keys()
>>> ks
dict_keys(['second', 'first'])
>>> d["third"] = 3
>>> ks
dict_keys(['second', 'third', 'first'])

Python 2.x

>>> d = { "first" : 1, "second" : 2 }
>>> ks = d.keys()
>>> ks
['second', 'first']
>>> d["third"] = 3
>>> ks
['second', 'first']

As Python 3's keys() returns the dynamic object Python 3 doesn't have (and has no need for) a separate iterkeys method.

Further clarification

In Python 3, keys() returns a dict_keys object but if we use it in a for loop context for k in d.keys() then an iterator is implicitly created. So the difference between for k in d.keys() and for k in iter(d.keys()) is one of implicit vs. explicit creation of the iterator.

In terms of another difference, whilst they are both dynamic, remember if we create an explicit iterator then it can only be used once whereas the view can be reused as required. e.g.

>>> ks = d.keys()
>>> 'first' in ks
True
>>> 'second' in ks
True
>>> i = iter(d.keys())
>>> 'first' in i
True
>>> 'second' in i
False             # because we've already reached the end of the iterator

Also, notice that if we create an explicit iterator and then modify the dict then the iterator is invalidated:

>>> i2 = iter(d.keys())
>>> d['fourth'] = 4
>>> for k in i2: print(k)
...
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
RuntimeError: dictionary changed size during iteration

In Python 2, given the existing behaviour of keys a separate method was needed to provide a way to iterate without copying the list of keys whilst still maintaining backwards compatibility. Hence iterkeys()