Set difference versus set subtraction
What distinguishes - and .difference() on sets? Obviously the syntax is not the same: one is a binary operator, the other is an instance method. What else?
>>> s1 = set([1, 2, 3])
>>> s2 = set([3, 4, 5])
>>> s1 - s2
set([1, 2])
>>> s1.difference(s2)
set([1, 2])
set.difference, set.union, etc. can take any iterable as the second argument, while both operands need to be sets to use -. There is no difference in the output.
Operation          Equivalent    Result
s.difference(t)    s - t         new set with elements in s but not in t
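To make the iterable-versus-set point concrete, here is a small sketch (Python 3 syntax; the variable names are only for illustration). It shows the method accepting a list and a generator, and the operator rejecting a list:

```python
s1 = {1, 2, 3}

# The method accepts any iterable: a list, a generator, a string, ...
from_list = s1.difference([3, 4, 5])
from_gen = s1.difference(x for x in range(3, 6))

# The operator needs a set (or frozenset) on both sides.
try:
    s1 - [3, 4, 5]
    operator_error = None
except TypeError as exc:
    operator_error = exc

print(from_list)                      # {1, 2}
print(from_gen)                       # {1, 2}
print(type(operator_error).__name__)  # TypeError
```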
With .difference you can do things like:
s1 = set([1,2,3])
print(s1.difference(*[[3],[4],[5]]))
{1, 2}
It is also more efficient when you pass several iterables at once using the *(iterable, iterable) syntax, as you don't create intermediary sets. You can see some comparisons here
At a quick glance it may not be evident from the documentation, but buried deep inside it is a paragraph dedicated to differentiating the method call from the operator version:
Note, the non-operator versions of union(), intersection(), difference(), and symmetric_difference(), issubset(), and issuperset() methods will accept any iterable as an argument. In contrast, their operator based counterparts require their arguments to be sets. This precludes error-prone constructions like set('abc') & 'cbs' in favor of the more readable set('abc').intersection('cbs').
The documentation appears to suggest that difference can take multiple sets, so it may be more efficient and clearer for things like:
s1 = set([1, 2, 3, 4])
s2 = set([2, 5])
s3 = set([3, 6])
s1.difference(s2, s3) # instead of s1 - s2 - s3
but I would suggest some testing to verify.
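One way to run that test is sketched below with the standard timeit module. The set sizes and the repeat count are arbitrary assumptions chosen only for illustration, and the timings will vary by machine and Python version, so treat the numbers as a rough indication rather than a verdict:

```python
import timeit

# Arbitrary sample data; adjust the sizes to resemble your real workload.
s1 = set(range(1000))
s2 = set(range(0, 1000, 2))
s3 = set(range(0, 1000, 3))

# Both spellings must produce the same set.
assert s1.difference(s2, s3) == s1 - s2 - s3

t_method = timeit.timeit(lambda: s1.difference(s2, s3), number=10_000)
t_operator = timeit.timeit(lambda: s1 - s2 - s3, number=10_000)

print(f"s1.difference(s2, s3): {t_method:.4f}s")
print(f"s1 - s2 - s3:          {t_operator:.4f}s")
```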