Set difference versus set subtraction

What distinguishes - and .difference() on sets? Obviously the syntax is not the same, one is a binary operator, the other is an instance method. What else?

s1 = set([1,2,3])
s2 = set([3,4,5])

>>> s1 - s2
set([1, 2])
>>> s1.difference(s2)
set([1, 2])

set.difference, set.union... can take any iterable as the second arg while both need to be sets to use -, there is no difference in the output.

Operation         Equivalent   Result
s.difference(t)   s - t        new set with elements in s but not in t

With .difference you can do things like:

s1 = set([1,2,3])

print(s1.difference(*[[3],[4],[5]]))

{1, 2}

It is also more efficient when creating sets using the *(iterable,iterable) syntax as you don't create intermediary sets, you can see some comparisons here

On a quick glance it may not be quite evident from the documentation but buried deep inside a paragraph is dedicated to differentiate the method call with the operator version

Note, the non-operator versions of union(), intersection(), difference(), and symmetric_difference(), issubset(), and issuperset() methods will accept any iterable as an argument. In contrast, their operator based counterparts require their arguments to be sets. This precludes error-prone constructions like set('abc') & 'cbs' in favor of the more readable set('abc').intersection('cbs').

The documentation appears to suggest that difference can take multiple sets, so it is possible that it might be more efficient and clearer for things like:

s1 = set([1, 2, 3, 4])
s2 = set([2, 5])
s3 = set([3, 6])
s1.difference(s2, s3) # instead of s1 - s2 - s3

but I would suggest some testing to verify.

Set difference versus set subtraction

Related

Recent Posts