Why can't I iterate twice over the same data?

It's because data is an iterator, and you can consume an iterator only once. For example:

lst = [1, 2, 3]
it = iter(lst)

next(it)
=> 1
next(it)
=> 2
next(it)
=> 3
next(it)
=> StopIteration

If we are traversing some data using a for loop, the loop catches that last StopIteration and exits normally the first time. If we try to iterate over it again, next() raises StopIteration immediately, so the second loop ends without running its body even once: the iterator has already been consumed.
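
To make that concrete, here is a rough sketch of what a for loop does with an iterator. The helper name manual_for is made up for illustration, and this is only a plain-Python equivalent, not how the interpreter actually implements the loop:

```python
def manual_for(iterable):
    """Collect items the way a for loop would visit them."""
    seen = []
    it = iter(iterable)        # for an iterator, iter() returns the same object
    while True:
        try:
            item = next(it)
        except StopIteration:  # the loop ends silently here
            break
        seen.append(item)
    return seen

it = iter([1, 2, 3])
first_pass = manual_for(it)    # consumes the iterator
second_pass = manual_for(it)   # nothing left to consume
print(first_pass)   # [1, 2, 3]
print(second_pass)  # []
```

The second pass returns an empty list rather than raising, which is why a second for loop over the same iterator simply appears to do nothing.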

Now for the second question: What if we do need to traverse the iterator more than once? A simple solution would be to create a list with the elements, and we can traverse it as many times as needed. This is all right as long as there are few elements in the list:

data = list(db[3])
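
Since db[3] is not shown in the question, here is a small self-contained sketch of the same idea. The generator rows() is a hypothetical stand-in for whatever db[3] returns:

```python
def rows():
    """Hypothetical stand-in for db[3]: yields items one at a time."""
    for i in range(3):
        yield i * 10

data = list(rows())        # consume the iterator once, keep the items

total = sum(data)          # first traversal
largest = max(data)        # second traversal works, because data is a list
print(total, largest)      # 30 20
```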

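But if there are many elements, it can be a better idea to create independent iterators using tee(). Keep in mind that tee() buffers every item that one iterator has produced and another has not yet consumed, so it saves memory only when the iterators advance roughly in step: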

import itertools
it1, it2 = itertools.tee(db[3], 2) # create as many as needed

Now we can loop over each one in turn:

for e in it1:
    print("doing this one time")

for e in it2:
    print("doing this two times")
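
A runnable version of the same pattern might look like the sketch below. Again, rows() is a hypothetical stand-in for db[3], and the processing done in each pass is arbitrary:

```python
import itertools

def rows():
    """Hypothetical stand-in for db[3]."""
    for ch in "abc":
        yield ch

source = rows()
it1, it2 = itertools.tee(source, 2)   # the count is passed positionally

first = [e.upper() for e in it1]      # first traversal
second = [e * 2 for e in it2]         # second traversal sees the same items
print(first)   # ['A', 'B', 'C']
print(second)  # ['aa', 'bb', 'cc']
```

After calling tee(), avoid using the original iterator (source here) directly, since advancing it would make the tee'd iterators skip items.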

Once an iterator is exhausted, it will not yield any more items.

>>> it = iter([3, 1, 2])
>>> for x in it: print(x)
...
3
1
2
>>> for x in it: print(x)
...
>>>
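
One way to check whether an object is a one-shot iterator or a container that can be looped over repeatedly: calling iter() on an iterator returns the object itself, while a container such as a list hands out a fresh iterator each time. A small sketch:

```python
lst = [3, 1, 2]
it = iter(lst)

print(iter(it) is it)     # True: an iterator returns itself
print(iter(lst) is lst)   # False: a list creates a new iterator

# Looping over the list twice works, because each loop gets its own iterator.
once = [x for x in lst]
twice = [x for x in lst]
print(once == twice)      # True
```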

I want to complement @ÓscarLópez's answer for anyone looking for a solution in 2017 and using Python 2.7 or 3.

tee() takes no keyword arguments: the number of iterators must be passed as a positional integer, so a call such as tee(db[3], n=2) raises a TypeError. This is the right way to use tee():

import itertools
it1, it2 = itertools.tee(db[3], 2)
