List unhashable, but tuple hashable?
Mainly, because tuples are immutable. Assume the following works:
>>> l = [1, 2, 3]
>>> t = (1, 2, 3)
>>> x = {l: 'a list', t: 'a tuple'}
Now, what happens when you do l.append(4)? You've modified the key in your dictionary! From afar! If you're familiar with how hashing algorithms work, this should frighten you. Tuples, on the other hand, are absolutely immutable. t += (1,) might look like it's modifying the tuple, but really it's not: it is simply creating a new tuple and rebinding the name t to it, leaving your dictionary key unchanged.
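To make the rebinding visible, here is a minimal sketch. The name `original` is only an illustrative second reference to the first tuple; it is not part of the example above.

```python
t = (1, 2, 3)
d = {t: "a tuple"}

original = t   # second reference to the same tuple object
t += (1,)      # builds a new tuple and rebinds the name t to it

print(t)              # (1, 2, 3, 1)
print(original)       # (1, 2, 3)
print(t is original)  # False: two distinct objects
print(d)              # {(1, 2, 3): 'a tuple'}
```

The key stored in the dictionary is the object that `original` still refers to, so the entry is untouched.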
You could totally make that work, but I bet you wouldn't like the effects.
from functools import reduce
from operator import xor
class List(list):
    def __hash__(self):
        return reduce(xor, self)
Now let's see what happens:
>>> l = List([23,42,99])
>>> hash(l)
94
>>> d = {l: "Hello"}
>>> d[l]
'Hello'
>>> l.append(7)
>>> d
{[23, 42, 99, 7]: 'Hello'}
>>> l
[23, 42, 99, 7]
>>> d[l]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
KeyError: [23, 42, 99, 7]
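Why the lookup fails can be seen by comparing the hash before and after the mutation. The following is a rough sketch that repeats the class so it runs on its own; the start value 0 passed to reduce is my addition (so an empty list would not raise), and the variable names are illustrative.

```python
from functools import reduce
from operator import xor

class List(list):
    def __hash__(self):
        # Content-based hash as in the answer; the 0 start value is an added assumption
        return reduce(xor, self, 0)

l = List([23, 42, 99])
hash_before = hash(l)          # 23 ^ 42 ^ 99 == 94
d = {l: "Hello"}               # the entry is filed under hash 94

l.append(7)
hash_after = hash(l)           # 94 ^ 7 == 89

still_stored = any(k is l for k in d)   # True: the object is still a key
found_after_append = l in d             # False: the dict now searches under hash 89

l.pop()                                 # undo the mutation, hash is 94 again
found_after_pop = l in d                # True

print(hash_before, hash_after)
print(still_stored, found_after_append, found_after_pop)
```

The entry never left the dictionary; it just sits where the old hash put it, and the lookup goes looking where the new hash points.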
edit: So I thought about this some more. You could make the above example work, if you return the list's id as its hash value:
class List(list):
    def __hash__(self):
        return id(self)
In that case, d[l] will give you 'Hello', but neither d[[23,42,99,7]] nor d[List([23,42,99,7])] will, because each creates a new [Ll]ist: the plain list is still unhashable, and the new List has a different id and therefore a different hash.
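A short sketch of that behaviour, assuming the id-based class above; `twin` and the result variables are names I made up for illustration.

```python
class List(list):
    def __hash__(self):
        # Identity-based hash: stays the same however the contents change
        return id(self)

l = List([23, 42, 99])
d = {l: "Hello"}
l.append(7)

same_object = d[l]             # 'Hello': the hash did not change

twin = List([23, 42, 99, 7])   # equal contents, but a different object
twin_equal = (twin == l)       # True: list equality still compares contents
twin_found = twin in d         # False: different id, so a different hash

try:
    d[[23, 42, 99, 7]]         # a plain list has no usable __hash__
    plain_list_error = None
except TypeError as exc:
    plain_list_error = type(exc).__name__

print(same_object, twin_equal, twin_found, plain_list_error)
```

Note the trade-off: two objects that compare equal now hash differently, so you can only ever look an entry up with the very same object you stored.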
Since a list is mutable, a hash based on its contents would change whenever you modify it, which ruins the point of having a hash (like in a set or a dict key).
Edit: I'm surprised this answer regularly gets new upvotes, it was really quickly written. I feel I need to make it better now.
So the built-in set and dict data structures are implemented as hash tables. Data types in Python may have a magic method __hash__() that is used when the table is built and when keys are looked up.
The built-in immutable data types (int, str, tuple, ...) have this method, and the hash value is based on the data and not the identity of the object, whereas the built-in mutable containers (list, dict, set) deliberately have no usable __hash__. (A tuple is only hashable if all of its elements are.) You can check this with:
>>> a = (0,1)
>>> b = (0,1)
>>> a is b
False # Different objects
>>> hash(a) == hash(b)
True # Same hash
If we follow this logic, mutating the data would mutate the hash, but then what's the point of a changing hash? It defeats the whole purpose of sets, dicts, and any other use of hashes.
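If you want to check hashability yourself, something like the following sketch works. The helper `is_hashable` is hypothetical (it is not a built-in); it simply calls hash() and catches the TypeError.

```python
def is_hashable(obj):
    """Hypothetical helper: True if hash(obj) succeeds, False otherwise."""
    try:
        hash(obj)
    except TypeError:
        return False
    return True

# Built-in mutable containers opt out of hashing by setting __hash__ to None
print(list.__hash__)             # None

print(is_hashable((1, 2, 3)))    # True
print(is_hashable([1, 2, 3]))    # False
print(is_hashable((1, [2, 3])))  # False: a tuple hashes its elements

# Equal tuples built separately at runtime still share one hash value
a = tuple([0, 1])
b = tuple([0, 1])
print(a is b, hash(a) == hash(b))   # False True
```

The last line is the same check as in the session above, just with the tuples built at runtime so no constant sharing can make them the same object.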
Fun fact: if you try the example with small ints (-5 <= i <= 256) or with some strings, a is b returns True because of micro-optimizations (in CPython at least).