Converting a list to a set changes element order
Solution 1:
A
set
is an unordered data structure, so it does not preserve the insertion order.-
This depends on your requirements. If you have an normal list, and want to remove some set of elements while preserving the order of the list, you can do this with a list comprehension:
>>> a = [1, 2, 20, 6, 210] >>> b = set([6, 20, 1]) >>> [x for x in a if x not in b] [2, 210]
If you need a data structure that supports both fast membership tests and preservation of insertion order, you can use the keys of a Python dictionary, which starting from Python 3.7 is guaranteed to preserve the insertion order:
>>> a = dict.fromkeys([1, 2, 20, 6, 210]) >>> b = dict.fromkeys([6, 20, 1]) >>> dict.fromkeys(x for x in a if x not in b) {2: None, 210: None}
b
doesn't really need to be ordered here – you could use aset
as well. Note thata.keys() - b.keys()
returns the set difference as aset
, so it won't preserve the insertion order.In older versions of Python, you can use
collections.OrderedDict
instead:>>> a = collections.OrderedDict.fromkeys([1, 2, 20, 6, 210]) >>> b = collections.OrderedDict.fromkeys([6, 20, 1]) >>> collections.OrderedDict.fromkeys(x for x in a if x not in b) OrderedDict([(2, None), (210, None)])
Solution 2:
In Python 3.6, there is another solution for Python 2 and 3:set()
now should keep the order, but
>>> x = [1, 2, 20, 6, 210]
>>> sorted(set(x), key=x.index)
[1, 2, 20, 6, 210]
Solution 3:
Answering your first question, a set is a data structure optimized for set operations. Like a mathematical set, it does not enforce or maintain any particular order of the elements. The abstract concept of a set does not enforce order, so the implementation is not required to. When you create a set from a list, Python has the liberty to change the order of the elements for the needs of the internal implementation it uses for a set, which is able to perform set operations efficiently.
Solution 4:
Remove duplicates and preserve order by below function
def unique(sequence):
seen = set()
return [x for x in sequence if not (x in seen or seen.add(x))]
How to remove duplicates from a list while preserving order in Python
Solution 5:
In mathematics, there are sets and ordered sets (osets).
- set: an unordered container of unique elements (Implemented)
- oset: an ordered container of unique elements (NotImplemented)
In Python, only sets are directly implemented. We can emulate osets with regular dict keys (3.7+).
Given
a = [1, 2, 20, 6, 210, 2, 1]
b = {2, 6}
Code
oset = dict.fromkeys(a).keys()
# dict_keys([1, 2, 20, 6, 210])
Demo
Replicates are removed, insertion-order is preserved.
list(oset)
# [1, 2, 20, 6, 210]
Set-like operations on dict keys.
oset - b
# {1, 20, 210}
oset | b
# {1, 2, 5, 6, 20, 210}
oset & b
# {2, 6}
oset ^ b
# {1, 5, 20, 210}
Details
Note: an unordered structure does not preclude ordered elements. Rather, maintained order is not guaranteed. Example:
assert {1, 2, 3} == {2, 3, 1} # sets (order is ignored)
assert [1, 2, 3] != [2, 3, 1] # lists (order is guaranteed)
One may be pleased to discover that a list and multiset (mset) are two more fascinating, mathematical data structures:
- list: an ordered container of elements that permits replicates (Implemented)
- mset: an unordered container of elements that permits replicates (NotImplemented)*
Summary
Container | Ordered | Unique | Implemented
----------|---------|--------|------------
set | n | y | y
oset | y | y | n
list | y | n | y
mset | n | n | n*
*A multiset can be indirectly emulated with collections.Counter()
, a dict-like mapping of multiplicities (counts).