What makes a user-defined class unhashable?
Simply setting the __hash__ method to that of the tuple class is not enough. You haven't actually told it how to hash any differently. Tuples are hashable because they are immutable. If you really wanted to make your specific example work, it might look like this:
class X2(list):
    def __hash__(self):
        return hash(tuple(self))
In this case you are actually defining how to hash your custom list subclass. You just have to define exactly how it can generate a hash. You can hash on whatever you want, as opposed to using the tuple's hashing method:
    def __hash__(self):
        return hash("foobar" * len(self))
From the Python 3 docs:
If a class does not define an __eq__() method it should not define a __hash__() operation either; if it defines __eq__() but not __hash__(), its instances will not be usable as items in hashable collections. If a class defines mutable objects and implements an __eq__() method, it should not implement __hash__(), since the implementation of hashable collections requires that a key’s hash value is immutable (if the object’s hash value changes, it will be in the wrong hash bucket).
Ref: object.__hash__(self)
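To illustrate the "wrong hash bucket" problem from the quoted docs, here is a minimal sketch using the X2 list subclass from above. The variable names (item, bucket, found_before, found_after) are made up for the illustration, and the behaviour is what I'd expect on CPython 3:

```python
class X2(list):
    """List subclass hashed by its current contents."""

    def __hash__(self):
        return hash(tuple(self))


item = X2([1, 2])
bucket = {item}
found_before = item in bucket   # lookup uses the same hash that was stored

item.append(3)                  # mutation changes hash(tuple(self))
found_after = item in bucket    # lookup now uses a different hash value

print(found_before, found_after)
```

So a content-based hash on a list subclass is only safe if you never mutate an instance while it sits in a set or is used as a dict key.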
Sample code:
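class Hashable:
    pass


class Unhashable:
    # In Python 3, defining __eq__ without __hash__ makes instances unhashable.
    def __eq__(self, other):
        # Compare by identity; "self == other" here would recurse forever.
        return self is other


class HashableAgain:
    def __eq__(self, other):
        return self is other

    def __hash__(self):
        return id(self)


def main():
    # OK
    print(hash(Hashable()))
    # Raises TypeError: unhashable type: 'Unhashable'
    try:
        print(hash(Unhashable()))
    except TypeError as exc:
        print(exc)
    # OK
    print(hash(HashableAgain()))


if __name__ == "__main__":
    main()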
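The TypeError in the sample comes from Python 3 implicitly setting __hash__ to None on a class that defines __eq__ without __hash__. A small sketch of that, plus one way to keep the inherited identity-based hash; the class name KeepsDefaultHash is just a hypothetical name for this example:

```python
class Unhashable:
    def __eq__(self, other):
        return self is other


class KeepsDefaultHash:
    def __eq__(self, other):
        return self is other

    # Explicitly keep the hash inherited from object (identity-based).
    __hash__ = object.__hash__


print(Unhashable.__hash__)        # the attribute is None, hence "unhashable type"

obj = KeepsDefaultHash()
print(obj in {obj})               # usable in a set again
```

This only makes sense when equality is also identity-based, as it is here; otherwise equal objects could end up with different hashes.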
What you could and should do, based on your other question, is: don't subclass anything, just encapsulate a tuple. It's perfectly fine to do so in the init.
class X(object):
    def __init__(self, *args):
        self.tpl = args

    def __hash__(self):
        return hash(self.tpl)

    def __eq__(self, other):
        return self.tpl == other

    def __repr__(self):
        return repr(self.tpl)

x1 = X()
s = {x1}
which yields:
>>> s
set([()])
>>> x1
()
If you don't modify instances of X after creation, why aren't you subclassing tuple?
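For what it's worth, a tuple subclass needs no extra work to be hashable, since __hash__ and __eq__ are inherited. A minimal sketch, assuming the instances really are never modified (the sample values are made up):

```python
class X(tuple):
    """Immutable sequence; hashing and equality come from tuple."""


x1 = X((1, 2))
s = {x1}

print(x1 in s)                     # found by its inherited hash
print((1, 2) in s)                 # a plain tuple with the same items matches too
print(hash(x1) == hash((1, 2)))    # same hash as the equivalent tuple
```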
But I'll point out that your original code (assigning tuple.__hash__ and tuple.__eq__ on a list subclass) actually doesn't throw an error, at least in Python 2.6.
>>> class X(list):
...     __hash__ = tuple.__hash__
...     __eq__ = tuple.__eq__
...
>>> x = X()
>>> s = set((x,))
>>> s
set([[]])
I hesitate to say "works" because this doesn't do what you think it does.
>>> a = X()
>>> b = X((5,))
>>> hash(a)
4299954584
>>> hash(b)
4299954672
>>> id(a)
4299954584
>>> id(b)
4299954672
It's just using the object id as a hash. When you actually call __hash__ you still get an error; likewise for __eq__.
>>> a.__hash__()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: descriptor '__hash__' for 'tuple' objects doesn't apply to 'X' object
>>> X().__eq__(X())
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: descriptor '__eq__' for 'tuple' objects doesn't apply to 'X' object
I gather that the Python internals, for some reason, are detecting that X has a __hash__ and an __eq__ method, but aren't calling them.
The moral of all this is: just write a real hash function. Since this is a sequence object, converting it to a tuple and hashing that is the most obvious approach.
    def __hash__(self):
        return hash(tuple(self))
In addition to the answers above: for the specific case of a dataclass in Python 3.7+, you can make it hashable by using
@dataclass(frozen=True)
class YourClass:
    pass
as the decorator instead of
@dataclass
class YourClass:
    pass
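A slightly fuller sketch of the difference, with fields added so the hash has something to work on. FrozenPoint and MutablePoint are hypothetical class names for this example; the behaviour relies on the dataclass defaults (eq=True):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FrozenPoint:       # hypothetical example class
    x: int
    y: int


@dataclass
class MutablePoint:      # hypothetical example class
    x: int
    y: int


# Instances with equal fields compare equal and hash alike, so the set keeps one.
points = {FrozenPoint(1, 2), FrozenPoint(1, 2)}
print(len(points))

# With eq=True and no frozen=True, the decorator sets __hash__ to None.
print(MutablePoint.__hash__)
```

Trying to put a MutablePoint into a set would raise the same "unhashable type" TypeError discussed above.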