Remove all values within one list from another list? [duplicate]
Solution 1:
>>> a = range(1, 10)
>>> [x for x in a if x not in [2, 3, 7]]
[1, 4, 5, 6, 8, 9]
Solution 2:
I was looking for fast way to do the subject, so I made some experiments with suggested ways. And I was surprised by results, so I want to share it with you.
Experiments were done using pythonbenchmark tool and with
a = range(1,50000) # Source list
b = range(1,15000) # Items to remove
Results:
def comprehension(a, b):
return [x for x in a if x not in b]
5 tries, average time 12.8 sec
def filter_function(a, b):
return filter(lambda x: x not in b, a)
5 tries, average time 12.6 sec
def modification(a,b):
for x in b:
try:
a.remove(x)
except ValueError:
pass
return a
5 tries, average time 0.27 sec
def set_approach(a,b):
return list(set(a)-set(b))
5 tries, average time 0.0057 sec
Also I made another measurement with bigger inputs size for the last two functions
a = range(1,500000)
b = range(1,100000)
And the results:
For modification (remove method) - average time is 252 seconds For set approach - average time is 0.75 seconds
So you can see that approach with sets is significantly faster than others. Yes, it doesn't keep similar items, but if you don't need it - it's for you. And there is almost no difference between list comprehension and using filter function. Using 'remove' is ~50 times faster, but it modifies source list. And the best choice is using sets - it's more than 1000 times faster than list comprehension!
Solution 3:
If you don't have repeated values, you could use set difference.
x = set(range(10))
y = x - set([2, 3, 7])
# y = set([0, 1, 4, 5, 6, 8, 9])
and then convert back to list, if needed.
Solution 4:
a = range(1,10)
itemsToRemove = set([2, 3, 7])
b = filter(lambda x: x not in itemsToRemove, a)
or
b = [x for x in a if x not in itemsToRemove]
Don't create the set inside the lambda
or inside the comprehension. If you do, it'll be recreated on every iteration, defeating the point of using a set at all.
Solution 5:
The simplest way is
>>> a = range(1, 10)
>>> for x in [2, 3, 7]:
... a.remove(x)
...
>>> a
[1, 4, 5, 6, 8, 9]
One possible problem here is that each time you call remove(), all the items are shuffled down the list to fill the hole. So if a
grows very large this will end up being quite slow.
This way builds a brand new list. The advantage is that we avoid all the shuffling of the first approach
>>> removeset = set([2, 3, 7])
>>> a = [x for x in a if x not in removeset]
If you want to modify a
in place, just one small change is required
>>> removeset = set([2, 3, 7])
>>> a[:] = [x for x in a if x not in removeset]