Remove all values within one list from another list? [duplicate]

Solution 1:

>>> a = range(1, 10)
>>> [x for x in a if x not in [2, 3, 7]]
[1, 4, 5, 6, 8, 9]

Solution 2:

I was looking for fast way to do the subject, so I made some experiments with suggested ways. And I was surprised by results, so I want to share it with you.

Experiments were done using pythonbenchmark tool and with

a = range(1,50000) # Source list
b = range(1,15000) # Items to remove

Results:

 def comprehension(a, b):
     return [x for x in a if x not in b]

5 tries, average time 12.8 sec

def filter_function(a, b):
    return filter(lambda x: x not in b, a)

5 tries, average time 12.6 sec

def modification(a,b):
    for x in b:
        try:
            a.remove(x)
        except ValueError:
            pass
    return a

5 tries, average time 0.27 sec

def set_approach(a,b):
    return list(set(a)-set(b))

5 tries, average time 0.0057 sec

Also I made another measurement with bigger inputs size for the last two functions

a = range(1,500000)
b = range(1,100000)

And the results:

For modification (remove method) - average time is 252 seconds For set approach - average time is 0.75 seconds

So you can see that approach with sets is significantly faster than others. Yes, it doesn't keep similar items, but if you don't need it - it's for you. And there is almost no difference between list comprehension and using filter function. Using 'remove' is ~50 times faster, but it modifies source list. And the best choice is using sets - it's more than 1000 times faster than list comprehension!

Solution 3:

If you don't have repeated values, you could use set difference.

x = set(range(10))
y = x - set([2, 3, 7])
# y = set([0, 1, 4, 5, 6, 8, 9])

and then convert back to list, if needed.

Solution 4:

a = range(1,10)
itemsToRemove = set([2, 3, 7])
b = filter(lambda x: x not in itemsToRemove, a)

or

b = [x for x in a if x not in itemsToRemove]

Don't create the set inside the lambda or inside the comprehension. If you do, it'll be recreated on every iteration, defeating the point of using a set at all.

Solution 5:

The simplest way is

>>> a = range(1, 10)
>>> for x in [2, 3, 7]:
...  a.remove(x)
... 
>>> a
[1, 4, 5, 6, 8, 9]

One possible problem here is that each time you call remove(), all the items are shuffled down the list to fill the hole. So if a grows very large this will end up being quite slow.

This way builds a brand new list. The advantage is that we avoid all the shuffling of the first approach

>>> removeset = set([2, 3, 7])
>>> a = [x for x in a if x not in removeset]

If you want to modify a in place, just one small change is required

>>> removeset = set([2, 3, 7])
>>> a[:] = [x for x in a if x not in removeset]