add vs update in set operations in Python

Solution 1:


set.add adds an individual element to the set. So,

>>> a = set()
>>> a.add(1)
>>> a
set([1])

works, but add cannot take an iterable unless the iterable itself is hashable. That is why a.add([1, 2]) fails.

>>> a.add([1, 2])
Traceback (most recent call last):
  File "<input>", line 1, in <module>
TypeError: unhashable type: 'list'

Here, [1, 2] is treated as the single element being added to the set and, as the error message says, a list cannot be hashed, but every element of a set must be hashable. Quoting the documentation:

Return a new set or frozenset object whose elements are taken from iterable. The elements of a set must be hashable.
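
To illustrate the hashability rule, here is a minimal sketch (the variable names are just for illustration) showing that add accepts an iterable as long as it is hashable, and stores it whole as one element:

```python
# Sketch: a hashable iterable can be added, but it is stored as ONE element.
a = set()
a.add((1, 2))              # a tuple of hashable items is itself hashable
a.add(frozenset([3, 4]))   # a frozenset is hashable too

try:
    a.add([1, 2])          # a list is mutable, hence unhashable
except TypeError as exc:
    error_message = str(exc)

print(len(a))          # 2: the tuple and the frozenset, each kept whole
print((1, 2) in a)     # True
print(error_message)   # the message mentions: unhashable type: 'list'
```

So the restriction is about hashability, not about the argument being an iterable.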


With set.update, you can pass one or more iterables; it iterates over each of them and adds the individual elements to the set. Remember: it accepts only iterables. That is why you get an error when you try to update it with 1:

>>> a.update(1)
Traceback (most recent call last):
  File "<input>", line 1, in <module>
TypeError: 'int' object is not iterable

But the following works, because the list [1] is iterated and its elements are added to the set.

>>> a.update([1])
>>> a
set([1])

set.update is essentially the in-place equivalent of the set union operation. Consider the following cases:

>>> set([1, 2]) | set([3, 4]) | set([1, 3])
set([1, 2, 3, 4])
>>> set([1, 2]) | set(range(3, 5)) | set(i for i in range(1, 5) if i % 2 == 1)
set([1, 2, 3, 4])

Here, we explicitly convert every iterable to a set and then take the union, which creates several intermediate sets and unions. In this case, set.update serves as a good helper. Since it accepts any iterable, you can simply do:

>>> a.update([1, 2], range(3, 5), (i for i in range(1, 5) if i % 2 == 1))
>>> a
set([1, 2, 3, 4])
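
As a rough sketch of how update relates to the in-place union operator (the names a, b and operator_rejected_list are only illustrative): both end up with the same elements, but the |= operator insists on a set on the right-hand side, while update takes any iterable.

```python
a = {1, 2}
b = {1, 2}

# update() takes any number of arbitrary iterables.
a.update([3], range(4, 6))

# |= is the operator form, but it needs a set on the right-hand side.
b |= {3}
b |= set(range(4, 6))

try:
    b |= [6]               # a plain list is rejected by the operator
except TypeError:
    operator_rejected_list = True
else:
    operator_rejected_list = False

print(sorted(a))               # [1, 2, 3, 4, 5]
print(a == b)                  # True
print(operator_rejected_list)  # True
```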

Solution 2:

add is faster for a single element because it is exactly for that purpose, adding a single element:

In [5]: timeit a.update([1])
10000000 loops, best of 3: 191 ns per loop

In [6]: timeit a.add(1) 
10000000 loops, best of 3: 69.9 ns per loop
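
If you are not in IPython, something like the following should reproduce the comparison with the standard timeit module. This is only a sketch: the loop count n is an arbitrary choice, and the absolute numbers will differ by machine and Python version.

```python
import timeit

setup = "a = set()"
n = 100000  # arbitrary number of repetitions for this sketch

# Total seconds for n calls of each statement.
t_add = timeit.timeit("a.add(1)", setup=setup, number=n)
t_update = timeit.timeit("a.update([1])", setup=setup, number=n)

print("add:    %.1f ns per call" % (t_add / n * 1e9))
print("update: %.1f ns per call" % (t_update / n * 1e9))
```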

update expects one or more iterables. So if you have a single hashable element to add, use add; if you have an iterable or several iterables of hashable elements to add, use update.
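
One case where the choice matters is a string, which is both hashable and iterable. A small sketch (the set names are made up for the example):

```python
words = set()
words.add("abc")        # the string is stored as one element

letters = set()
letters.update("abc")   # the string is iterated character by character

print(words == {"abc"})   # True
print(sorted(letters))    # ['a', 'b', 'c']
```

So to add one string, use add, or wrap it in a list if you want update, as in letters.update(["abc"]).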

Operation      Equivalent    Result
s.add(x)                     add element x to set s
s.update(t)    s |= t        return set s with elements added from t