What is the meaning of list[:] in this code? [duplicate]

This code is from Python's Documentation. I'm a little confused.

words = ['cat', 'window', 'defenestrate']
for w in words[:]:
    if len(w) > 6:
        words.insert(0, w)
print(words)

And the following is what I thought at first:

words = ['cat', 'window', 'defenestrate']
for w in words:
    if len(w) > 6:
        words.insert(0, w)
print(words)

Why does this code create an infinite loop when the first one doesn't?


Solution 1:

This is one of Python's gotchas that can trip up beginners.

The words[:] is the magic sauce here.

Observe:

>>> words =  ['cat', 'window', 'defenestrate']
>>> words2 = words[:]
>>> words2.insert(0, 'hello')
>>> words2
['hello', 'cat', 'window', 'defenestrate']
>>> words
['cat', 'window', 'defenestrate']

And now without the [:]:

>>> words =  ['cat', 'window', 'defenestrate']
>>> words2 = words
>>> words2.insert(0, 'hello')
>>> words2
['hello', 'cat', 'window', 'defenestrate']
>>> words
['hello', 'cat', 'window', 'defenestrate']

The main thing to note here is that words[:] returns a copy of the existing list, so you are iterating over a copy, which is not modified.

You can check whether you are referring to the same lists using id():

In the first case:

>>> words2 = words[:]
>>> id(words2)
4360026736
>>> id(words)
4360188992
>>> words2 is words
False

In the second case:

>>> words2 = words
>>> id(words2)
4360188992
>>> id(words)
4360188992
>>> words2 is words
True

It is worth noting that [i:j] is the slicing syntax: it returns a new list (a shallow copy) containing the elements from index i up to, but not including, index j.

So, with the list from the second case above, words[0:2] gives you

>>> words[0:2]
['hello', 'cat']

Omitting the starting index means it defaults to 0, while omitting the last index means it defaults to len(words), and the end result is that you receive a copy of the entire list.
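
As a small sketch of those defaults (the variable names first_two, from_one and whole are just illustrative), the following compares slices with omitted bounds against the original list:

```python
words = ['cat', 'window', 'defenestrate']

# Omitted bounds default to the start and the end of the list.
first_two = words[:2]   # same as words[0:2]
from_one = words[1:]    # same as words[1:len(words)]
whole = words[:]        # same as words[0:len(words)]

print(first_two)                       # ['cat', 'window']
print(from_one)                        # ['window', 'defenestrate']
print(whole == words, whole is words)  # True False
```

The last line shows that the full slice has equal contents but is a different object.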


If you want to make your code a little more readable, I recommend the copy module.

from copy import copy 

words = ['cat', 'window', 'defenestrate']
for w in copy(words):
    if len(w) > 6:
        words.insert(0, w)
print(words)

This basically does the same thing as your first code snippet, and is much more readable.

Alternatively (as mentioned by DSM in the comments), on Python 3.3 and later you can also use words.copy(), which does the same thing.
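
For illustration, here is the same loop written with words.copy(); this is a sketch that assumes Python 3.3 or later, where list.copy() is available:

```python
words = ['cat', 'window', 'defenestrate']

# Iterate over a copy so that inserting into words
# does not affect the sequence being looped over.
for w in words.copy():
    if len(w) > 6:
        words.insert(0, w)

print(words)  # ['defenestrate', 'cat', 'window', 'defenestrate']
```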

Solution 2:

words[:] copies all the elements of words into a new list. So when you iterate over words[:], you are iterating over the elements that words had at the moment the copy was made. When you then modify words, those modifications are not visible in the copy (because words[:] was evaluated once, before the loop started modifying words).

In the latter example, you are iterating over words itself, which means that any changes you make to words are visible to your iterator. As a result, when you insert at index 0 of words, you "bump up" every other element in words by one index. So when you move on to the next iteration of your for-loop, you get the element at the next index of words, but that is just the element you saw a moment ago (because you inserted an element at the beginning of the list, moving all the other elements up by one index).

To see this in action, try the following code (it never terminates, so interrupt it with Ctrl+C once you have seen the pattern):

words = ['cat', 'window', 'defenestrate']
for w in words:
    print("The list is:", words)
    print("I am looking at this word:", w)
    if len(w) > 6:
        print("inserting", w)
        words.insert(0, w)
        print("the list now looks like this:", words)
print(words)
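
If you would rather not interrupt an endless loop by hand, here is a bounded variant of the same experiment. It is only a sketch: the cap of 6 iterations and the seen list are additions for the demo, not part of the original code.

```python
words = ['cat', 'window', 'defenestrate']
seen = []  # (index, word) pairs visited by the loop

for index, w in enumerate(words):
    seen.append((index, w))
    if len(seen) == 6:  # hypothetical safety cap so the demo terminates
        break
    if len(w) > 6:
        words.insert(0, w)

for pair in seen:
    print(pair)
# (0, 'cat')
# (1, 'window')
# (2, 'defenestrate')
# (3, 'defenestrate')
# (4, 'defenestrate')
# (5, 'defenestrate')
```

From index 2 onward the loop keeps meeting the same word, because every insert pushes it one position further along.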

Solution 3:

(In addition to @Coldspeed's answer)

Look at the examples below:

words = ['cat', 'window', 'defenestrate']
words2 = words
words2 is words

results: True

This means the names words and words2 refer to the same object.

words = ['cat', 'window', 'defenestrate']
words2 = words[:]
words2 is words

results: False

In this case, we have created a new object.

Solution 4:

Let's have a look at iterators and iterables:

An iterable is an object that has an __iter__ method which returns an iterator, or which defines a __getitem__ method that can take sequential indexes starting from zero (and raises an IndexError when the indexes are no longer valid). So an iterable is an object that you can get an iterator from.

An iterator is an object with a next (Python 2) or __next__ (Python 3) method.

iter(iterable) returns an iterator object, and list_obj[:] returns a new list object that is an exact (shallow) copy of list_obj.
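
A short sketch of these definitions follows. The Squares class is a made-up example of an object that is iterable only through __getitem__.

```python
words = ['cat', 'window', 'defenestrate']

it = iter(words)                       # a list is iterable; iter() gives an iterator
list_is_iterator = hasattr(words, '__next__')
it_is_iterator = hasattr(it, '__next__')
first = next(it)

print(list_is_iterator)  # False: the list itself has no __next__
print(it_is_iterator)    # True
print(first)             # cat


class Squares:
    """Hypothetical class that supports iteration via __getitem__ only."""

    def __getitem__(self, index):
        if index >= 4:
            raise IndexError(index)    # signals the end of the sequence
        return index * index


squares = list(Squares())
print(squares)           # [0, 1, 4, 9]
```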

In your first case:

for w in words[:]:

The for loop iterates over a new copy of the list, not the original words. Changes to words have no effect on the iteration, so the loop terminates normally.

This is how the loop does its work:

  1. The loop calls iter() on the iterable to obtain an iterator.

  2. The loop calls next() on the iterator to get the next item. This step is repeated until there are no more elements left.

  3. The loop terminates when a StopIteration exception is raised.
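
The three steps above can be written out by hand, roughly as follows. This is a sketch of what the for statement does for you, not the interpreter's actual implementation; the visited list is only there to record what the loop saw.

```python
words = ['cat', 'window', 'defenestrate']
visited = []

iterator = iter(words[:])      # step 1: get an iterator (here, over a copy)
while True:
    try:
        w = next(iterator)     # step 2: fetch the next item
    except StopIteration:      # step 3: iterator exhausted, leave the loop
        break
    visited.append(w)
    if len(w) > 6:
        words.insert(0, w)

print(visited)  # ['cat', 'window', 'defenestrate']
print(words)    # ['defenestrate', 'cat', 'window', 'defenestrate']
```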

In your second case:

words = ['cat', 'window', 'defenestrate']
for w in words:
    if len(w) > 6:
        words.insert(0, w)
print(words)

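You are iterating over the original list words, and adding elements to words has a direct impact on the iteration. The list iterator does not take a snapshot: it keeps a reference to the list and its current position. Each insertion at the front makes the list one element longer and shifts 'defenestrate' to the next position, where the iterator finds it again, so the loop never ends.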

Look at this:

>>> l = [2, 4, 6, 8]
>>> i = iter(l) # returns a list_iterator object, which has a __next__ method
>>> next(i)
2
>>> next(i)
4
>>> l.insert(2, 'A')
>>> next(i)
'A'

Every time you update the original list before StopIteration is raised, the iterator sees the updated list and next() returns accordingly. That's why your loop runs forever.
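
To make that behaviour concrete, here is a rough pure-Python model of a list iterator. CPython's real list iterator is implemented in C, so the class name ListIteratorModel and its attributes are assumptions made for illustration; the point is only that it stores a reference and an index, not a copy.

```python
class ListIteratorModel:
    """Simplified, hypothetical model: a reference to the list plus an index."""

    def __init__(self, lst):
        self.lst = lst      # reference to the original list, not a copy
        self.index = 0

    def __iter__(self):
        return self

    def __next__(self):
        # The length is checked again on every call, so a growing list
        # keeps pushing the end further away.
        if self.index >= len(self.lst):
            raise StopIteration
        item = self.lst[self.index]
        self.index += 1
        return item


l = [2, 4, 6, 8]
i = ListIteratorModel(l)
first, second = next(i), next(i)
l.insert(2, 'A')            # modify the list while iteration is in progress
third = next(i)

print(first, second, third)  # 2 4 A
```

The model reproduces the session above: the insert lands exactly at the position the iterator reads next.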

For more on iteration and the iteration protocol you can look here.