python remove duplicates from 2 lists

Solution 1:

Here is what's going on. Suppose you have this list:

['a', 'b', 'c', 'd']

and you are looping over every element in the list. Suppose you are currently at index position 1:

['a', 'b', 'c', 'd']
       ^
       |
   index = 1

...and you remove the element at index position 1, giving you this:

['a',      'c', 'd']
       ^
       |
   index = 1

After removing the item, the other items slide to the left, giving you this:

['a', 'c', 'd']
       ^
       |
   index = 1

Then when the loop runs again, the loop increments the index to 2, giving you this:

['a', 'c', 'd']
            ^ 
            |
         index = 2

See how you skipped over 'c'? It slid into index 1, which the loop had already visited. The lesson is: never delete an element from a list while you are looping over it.
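The skipping is easy to reproduce. Here is a minimal sketch using the same four-letter list as above; the names `items` and `visited` are made up for illustration:

```python
items = ['a', 'b', 'c', 'd']
visited = []

for item in items:
    visited.append(item)      # record every element the loop actually sees
    if item == 'b':
        items.remove(item)    # the list shrinks while the loop is still running

print(visited)  # ['a', 'b', 'd'] -- 'c' was never visited
print(items)    # ['a', 'c', 'd']
```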

Solution 2:

Your problem seems to be that you're changing the list you're iterating over. Iterate over a copy of the list instead.

for i in b[:]:  # b[:] is a shallow copy, so removing from b does not disturb the loop
    if i in a:
        b.remove(i)


>>> b
['123', '456']

But how about using a list comprehension instead?

>>> a = ["abc", "def", "ijk", "lmn", "opq", "rst", "xyz"]
>>> b = ["ijk", "lmn", "opq", "rst", "123", "456"]
>>> [elem for elem in b if elem not in a]
['123', '456']
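If `a` is long, `elem not in a` rescans the whole list for every element of `b`. One possible variation, assuming the elements are hashable (as strings are), is to build a set from `a` first. The names `a_set` and `result` are introduced here for illustration only:

```python
a = ["abc", "def", "ijk", "lmn", "opq", "rst", "xyz"]
b = ["ijk", "lmn", "opq", "rst", "123", "456"]

a_set = set(a)  # membership tests on a set take constant time on average
result = [elem for elem in b if elem not in a_set]

print(result)  # ['123', '456']
```

The order of `b` and any repeated elements in it are kept, because only the lookup side was turned into a set.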

Solution 3:

What about a set difference?

b = set(b) - set(a)

Note that this leaves b as a set: duplicates are dropped and the order is not guaranteed. If you need repetitions in b to also appear in the result, or the order to be preserved, then

b = [x for x in b if x not in a]

would do.
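To see the difference between the two approaches, here is a small sketch with a repeated element in `b`. The sample values and the names `as_set` and `as_list` are made up for illustration:

```python
a = ["x", "y"]
b = ["q", "x", "p", "q"]

as_set = set(b) - set(a)                # duplicates collapse, order is arbitrary
as_list = [x for x in b if x not in a]  # duplicates and order survive

print(sorted(as_set))  # ['p', 'q']
print(as_list)         # ['q', 'p', 'q']
```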

Solution 4:

You asked to remove the duplicates from both lists; here's my solution:

from collections import OrderedDict
a = ["abc", "def", "ijk", "lmn", "opq", "rst", "xyz"]
b = ["ijk", "lmn", "opq", "rst", "123", "456", ]

x = OrderedDict.fromkeys(a)
y = OrderedDict.fromkeys(b)

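# Iterate over a snapshot of the keys: popping from x while iterating
# over it directly raises RuntimeError in Python 3.
for k in list(x):
    if k in y:
        x.pop(k)
        y.pop(k)

print(list(x))
print(list(y))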

Result:

['abc', 'def', 'xyz']
['123', '456']

The nice thing here is that you keep the order of the items in both lists.
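A possible variation that avoids mutating anything during iteration, assuming the elements are hashable: compute the shared elements once, then filter each list against them. Unlike the `OrderedDict` version, this also keeps items that are repeated within a single list. The names `common`, `a_only` and `b_only` are chosen for this sketch:

```python
a = ["abc", "def", "ijk", "lmn", "opq", "rst", "xyz"]
b = ["ijk", "lmn", "opq", "rst", "123", "456"]

common = set(a) & set(b)  # elements that appear in both lists

a_only = [x for x in a if x not in common]
b_only = [x for x in b if x not in common]

print(a_only)  # ['abc', 'def', 'xyz']
print(b_only)  # ['123', '456']
```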

Solution 5:

Or use a set:

set(b).difference(a)

Be forewarned: sets do not preserve order, if that is important.
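If the order does matter, one possible way to restore it (a sketch, not part of the answer above) is to sort the resulting set by each element's position in `b`. Duplicates in `b` are still collapsed, and `b.index` is a linear scan, so this is only reasonable for short lists. The names `diff` and `ordered` are made up here:

```python
a = ["abc", "def", "ijk", "lmn", "opq", "rst", "xyz"]
b = ["ijk", "lmn", "opq", "rst", "123", "456"]

diff = set(b).difference(a)          # difference() accepts any iterable
ordered = sorted(diff, key=b.index)  # put the survivors back in b's order

print(ordered)  # ['123', '456']
```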