Loop "Forgets" to Remove Some Items [duplicate]

Solution 1:

You're modifying the list you're iterating over, which is bound to result in some unintuitive behavior. Instead, make a copy of the list so you don't remove elements from what you're iterating through.

for char in textlist[:]: #shallow copy of the list
    # etc

To clarify the behavior you're seeing, check this out. Put print char, textlist at the beginning of your (original) loop. You'd expect, perhaps, that this would print out your string vertically, alongside the list, but what you'll actually get is this:

H ['H', 'e', 'y', ' ', 'l', 'o', 'o', 'k', ' ', 'W', 'o', 'r', 'd', 's', '!']
e ['H', 'e', 'y', ' ', 'l', 'o', 'o', 'k', ' ', 'W', 'o', 'r', 'd', 's', '!']
  ['H', 'y', ' ', 'l', 'o', 'o', 'k', ' ', 'W', 'o', 'r', 'd', 's', '!'] # !
l ['H', 'y', ' ', 'l', 'o', 'o', 'k', ' ', 'W', 'o', 'r', 'd', 's', '!']
o ['H', 'y', ' ', 'l', 'o', 'o', 'k', ' ', 'W', 'o', 'r', 'd', 's', '!']
k ['H', 'y', ' ', 'l', 'o', 'k', ' ', 'W', 'o', 'r', 'd', 's', '!'] # Problem!!
  ['H', 'y', ' ', 'l', 'o', 'k', ' ', 'W', 'o', 'r', 'd', 's', '!']
W ['H', 'y', ' ', 'l', 'o', 'k', ' ', 'W', 'o', 'r', 'd', 's', '!']
o ['H', 'y', ' ', 'l', 'o', 'k', ' ', 'W', 'o', 'r', 'd', 's', '!'] 
d ['H', 'y', ' ', 'l', 'k', ' ', 'W', 'o', 'r', 'd', 's', '!']
s ['H', 'y', ' ', 'l', 'k', ' ', 'W', 'o', 'r', 'd', 's', '!']
! ['H', 'y', ' ', 'l', 'k', ' ', 'W', 'o', 'r', 'd', 's', '!']
Hy lk Words!

So what's going on? The nice for x in y loop in Python is really just syntactic sugar: it still accesses list elements by index. So when you remove elements from the list while iterating over it, you start skipping values (as you can see above). As a result, you never see the second o in "look"; you skip over it because the index has advanced "past" it when you deleted the previous element. Then, when you get to the o in "Words", you go to remove the first occurrence of 'o', which is the one you skipped before.


As others have mentioned, list comprehensions are probably an even better (cleaner, clearer) way to do this. Make use of the fact that Python strings are iterable:

def remove_vowels(text): # function names should start with verbs! :)
    return ''.join(ch for ch in text if ch.lower() not in 'aeiou')

Solution 2:

Other answers tell you why for skips items as you alter the list. This answer tells you how you should remove characters in a string without an explicit loop, instead.

Use str.translate():

vowels = 'aeiou'
vowels += vowels.upper()
text.translate(None, vowels)

This deletes all characters listed in the second argument.

Demo:

>>> text = "Hey look Words!"
>>> vowels = 'aeiou'
>>> vowels += vowels.upper()
>>> text.translate(None, vowels)
'Hy lk Wrds!'
>>> text = 'The Quick Brown Fox Jumps Over The Lazy Fox'
>>> text.translate(None, vowels)
'Th Qck Brwn Fx Jmps vr Th Lzy Fx'

In Python 3, the str.translate() method (Python 2: unicode.translate()) differs in that it doesn't take a deletechars parameter; the first argument is a dictionary mapping Unicode ordinals (integer values) to new values instead. Use None for any character that needs to be deleted:

# Python 3 code
vowels = 'aeiou'
vowels += vowels.upper()
vowels_table = dict.fromkeys(map(ord, vowels))
text.translate(vowels_table)

You can also use the str.maketrans() static method to produce that mapping:

vowels = 'aeiou'
vowels += vowels.upper()
text.translate(text.maketrans('', '', vowels))

Solution 3:

Quoting from the docs:

Note: There is a subtlety when the sequence is being modified by the loop (this can only occur for mutable sequences, i.e. lists). An internal counter is used to keep track of which item is used next, and this is incremented on each iteration. When this counter has reached the length of the sequence the loop terminates. This means that if the suite deletes the current (or a previous) item from the sequence, the next item will be skipped (since it gets the index of the current item which has already been treated). Likewise, if the suite inserts an item in the sequence before the current item, the current item will be treated again the next time through the loop. This can lead to nasty bugs that can be avoided by making a temporary copy using a slice of the whole sequence, e.g.,

for x in a[:]:
    if x < 0: a.remove(x)

Iterate over a shallow copy of the list using [:]. You're modifying a list while iterating over it, this will result in some letters being missed.

The for loop keeps track of index, so when you remove an item at index i, the next item at i+1th position shifts to the current index(i) and hence in the next iteration you'll actually pick the i+2th item.

Lets take an easy example:

>>> text = "whoops"
>>> textlist = list(text)
>>> textlist
['w', 'h', 'o', 'o', 'p', 's']
for char in textlist:
    if char.lower() in 'aeiou':
        textlist.remove(char)

Iteration 1 : Index = 0.

char = 'W' as it is at index 0. As it doesn't satisfies that condition you'll do noting.

Iteration 2 : Index = 1.

char = 'h' as it is at index 1. Nothing more to do here.

Iteration 3 : Index = 2.

char = 'o' as it is at index 2. As this item satisfies the condition so it'll be removed from the list and all the items to it's right will shift one place to the left to fill the gap.

now textlist becomes :

   0    1    2    3    4
`['w', 'h', 'o', 'p', 's']`

As you can see the other 'o' moved to index 2, i.e the current index so it'll be skipped in the next iteration. So, this is the reason some items are bring skipped in your iteration. Whenever you remove an item the next item is skipped from the iteration.

Iteration 4 : Index = 3.

char = 'p' as it is at index 3.

....


Fix:

Iterate over a shallow copy of the list to fix this issue:

for char in textlist[:]:        #note the [:]
    if char.lower() in 'aeiou':
        textlist.remove(char)

Other alternatives:

List comprehension:

A one-liner using str.join and a list comprehension:

vowels = 'aeiou'
text = "Hey look Words!"
return "".join([char for char in text if char.lower() not in vowels])

regex:

>>> import re
>>> text = "Hey look Words!"
>>> re.sub('[aeiou]', '', text, flags=re.I)
'Hy lk Wrds!'

Solution 4:

You're modifying the data you're iterating over. Don't do that.

''.join(x for x in textlist in x not in VOWELS)