How can I make a python dataclass hashable without making them immutable?

Solution 1:

From the docs:

Here are the rules governing implicit creation of a __hash__() method:

[...]

If eq and frozen are both true, by default dataclass() will generate a __hash__() method for you. If eq is true and frozen is false, __hash__() will be set to None, marking it unhashable (which it is, since it is mutable). If eq is false, __hash__() will be left untouched meaning the __hash__() method of the superclass will be used (if the superclass is object, this means it will fall back to id-based hashing).

Since you set eq=True and left frozen at the default (False), your dataclass is unhashable.

You have 3 options:

  • Set frozen=True (in addition to eq=True), which will make your class immutable and hashable.
  • Set unsafe_hash=True, which will create a __hash__ method but leave your class mutable, thus risking problems if an instance of your class is modified while stored in a dict or set:

    cat = Category('foo', 'bar')
    categories = {cat}
    cat.id = 'baz'
    
    print(cat in categories)  # False
    
  • Manually implement a __hash__ method.

Solution 2:

TL;DR

Use frozen=True in conjunction to eq=True (which will make the instances immutable).

Long Answer

From the docs:

__hash__() is used by built-in hash(), and when objects are added to hashed collections such as dictionaries and sets. Having a __hash__() implies that instances of the class are immutable. Mutability is a complicated property that depends on the programmer’s intent, the existence and behavior of __eq__(), and the values of the eq and frozen flags in the dataclass() decorator.

By default, dataclass() will not implicitly add a __hash__() method unless it is safe to do so. Neither will it add or change an existing explicitly defined __hash__() method. Setting the class attribute __hash__ = None has a specific meaning to Python, as described in the __hash__()documentation.

If __hash__() is not explicit defined, or if it is set to None, then dataclass() may add an implicit __hash__() method. Although not recommended, you can force dataclass() to create a __hash__() method with unsafe_hash=True. This might be the case if your class is logically immutable but can nonetheless be mutated. This is a specialized use case and should be considered carefully.

Here are the rules governing implicit creation of a __hash__() method. Note that you cannot both have an explicit __hash__() method in your dataclass and set unsafe_hash=True; this will result in a TypeError.

If eq and frozen are both true, by default dataclass() will generate a __hash__() method for you. If eq is true and frozen is false, __hash__() will be set to None, marking it unhashable (which it is, since it is mutable). If eq is false, __hash__() will be left untouched meaning the __hash__() method of the superclass will be used (if the superclass is object, this means it will fall back to id-based hashing).