C++ - Why is boost::hash_combine the best way to combine hash-values?

I've read in other posts that this seems to be the best way to combine hash-values. Could somebody please break this down and explain why this is the best way to do it?

template <class T>
inline void hash_combine(std::size_t& seed, const T& v)
{
    std::hash<T> hasher;
    seed ^= hasher(v) + 0x9e3779b9 + (seed<<6) + (seed>>2);
}

Edit: The other question is only asking for the magic number, but I'd like to get know about the whole function, not only this part.


It being the "best" is argumentative.

It being "good", or even "very good", at least superficially, is easy.

seed ^= hasher(v) + 0x9e3779b9 + (seed<<6) + (seed>>2);

We'll presume seed is a previous result of hasher or this algorithm.

^= means that the bits on the left and bits on the right all change the bits of the result.

hasher(v) is presumed to be a decent hash on v. But the rest is defence in case it isn't a decent hash.

0x9e3779b9 is a 32 bit value (it could be extended to 64 bit if size_t was 64 bit arguably) that contains half 0s and half 1s. It is basically a random series of 0s and 1s done by approximating particular irrational constant as a base-2 fixed point value. This helps ensure that if the hasher returns bad values, we still get a smear of 1s and 0s in our output.

(seed<<6) + (seed>>2) is a bit shuffle of the incoming seed.

Imagine the 0x constant was missing. Imagine the hasher returns the constant 0x01000 for almost every v passed in. Now, each bit of the seed is spread out over the next iteration of the hash, during which it is again spread out.

The seed ^= (seed<<6) + (seed>>2) 0x00001000 becomes 0x00041400 after one iteration. Then 0x00859500. As you repeat the operation, any set bits are "smeared out" over the output bits. Eventually the right and left bits collide, and carry moves the set bit from "even locations" to "odd locations".

The bits dependent on the value of an input seed grows relatively fast and in complex ways as the combine operation recurses on the seed operation. Adding causes carries, which smear things even more. The 0x constant adds a bunch of pseudo-random bits that make boring hash values occupy more than a few bits of the hash space after being combined.

It is asymmetric thanks to addition (combining the hashes of "dog" and "god" gives different results), it handles boring hash values (mapping characters to their ascii value, which only involves twiddling a handful of bits). And, it is reasonably fast.

Slower hash combines that are cryptographically strong can be better in other situations. I, naively, would presume that making the shifts be a combination of even and odd shifts might be a good idea (but maybe addition, which moves even bits from odd bits, makes that less of a problem: after 3 iterations, incoming lone seed bits will collide and add and cause a carry).

The downside to this kind of analysis is that it only takes one mistake to make a hash function really bad. Pointing out all the good things doesn't help that much. So another thing that makes it good now is that it is reasonably famous and in an open-source repository, and I haven't heard anyone point out why it is bad.


It's not the best, surprisingly to me it's not even particularily good. The main problem is the bad distribution, which is not really the fault of boost::hash_combine in itself, but in conjunction with a badly distributing hash like std::hash which is most commonly implemented with the identity function.

boost entropy matrix Figure 2: The effect of a single bit change in one of two random 32 bit numbers on the result of boost::hash_combine . On the x-axis are the input bits (two times 32, first the new hash then the old seed), on the y-axis are the output bits. The color indicate the degree of dependence.

To demonstrate how bad things can become these are the collisions for points (x,y) on a 32x32 grid when using hash_combine as intended, and with std::hash:

# hash_combine(hash_combine(0,x₀),y₀)=hash_combine(hash_combine(0,x₁),y₁)
# hash      x₀   y₀  x₁  y₁
3449074105  6   30   8  15
3449074104  6   31   8  16
3449074107  6   28   8  17
3449074106  6   29   8  18
3449074109  6   26   8  19
3449074108  6   27   8  20
3449074111  6   24   8  21
3449074110  6   25   8  22

For a well distributed hash there should be none, statistically. One could make a hash_combine that cascades more (for example by using multiple more spread out xor-shifts) and preserves the entropy better (for example using bit-rotations instead of bit-shifts). But really what you should do is use a good hash function in the first place, then after that a simple xor is sufficient to combine the seed and the hash, if the hash encodes the position in the sequence. For ease of implementation the following hash does not encode the position. To make hash_combine non commutative any non-commutative and bijective operation is sufficient. I chose an asymmetric binary rotation because it is cheap.

#include <limits>
#include <cstdint>

template<typename T>
T xorshift(const T& n,int i){
  return n^(n>>i);
}

// a hash function with another name as to not confuse with std::hash
uint32_t distribute(const uint32_t& n){
  uint32_t p = 0x55555555ul; // pattern of alternating 0 and 1
  uint32_t c = 3423571495ul; // random uneven integer constant; 
  return c*xorshift(p*xorshift(n,16),16);
}

// a hash function with another name as to not confuse with std::hash
uint64_t distribute(const uint64_t& n){
  uint64_t p = 0x5555555555555555ull; // pattern of alternating 0 and 1
  uint64_t c = 17316035218449499591ull;// random uneven integer constant; 
  return c*xorshift(p*xorshift(n,32),32);
}

// if c++20 rotl is not available:
template <typename T,typename S>
typename std::enable_if<std::is_unsigned<T>::value,T>::type
constexpr rotl(const T n, const S i){
  const T m = (std::numeric_limits<T>::digits-1);
  const T c = i&m;
  return (n<<c)|(n>>((T(0)-c)&m)); // this is usually recognized by the compiler to mean rotation, also c++20 now gives us rotl directly
}

// call this function with the old seed and the new key to be hashed and combined into the new seed value, respectively the final hash
template <class T>
inline size_t hash_combine(std::size_t& seed, const T& v)
{
    return rotl(seed,std::numeric_limits<size_t>::digits/3) ^ distribute(std::hash<T>{}(v));
}

The seed is rotated once before combining it to make the order in which the hash was computed relevant.

The hash_combine from boost needs two operations less, and more importantly no multiplications, in fact it's about 5x faster, but at about 2 cyles per hash on my machine the proposed solution is still very fast and pays off quickly when used for a hash table. There are 118 collisions on a 1024x1024 grid (vs. 982017 for boosts hash_combine + std::hash), about as many as expected for a well distributed hash function and that is all we can ask for.

Now even when used in conjunction with a good hash function boost::hash_combine is not ideal. If all entropy is in the seed at some point some of it will get lost. There are 2948667289 distinct results of boost::hash_combine(x,0), but there should be 4294967296 .

In conclusion, they tried to create a hash function that does both, combining and cascading, and fast, but ended up with something that does both just good enough to not be recognised as bad immediately. But fast it is.


ROTL For VS studio (you can deduce ROTR easily). (This is actually in reply to @WolfgangBrehm.)

Reason: the standard trick to induce the compiler to emit ror and/or rol instructions gives an error in VS: error C4146: unary minus operator applied to unsigned type, result still unsigned.

So... I solved the compiler error by replacing (-c) by (T(0) -c) but that is not going to be optimized.

Adding (MS specific) specialisations solves that as inspection of the emitted optimised assembly will show.

#include <intrin.h>              // and some more includes, see above...

template <typename T>            // default template is not good for optimisation
typename std::enable_if<std::is_unsigned<T>::value, T>::type
constexpr rotl(const T n, const int i)
{
    constexpr T m = (std::numeric_limits<T>::digits - 1);
    const T c = i & m;
    //return (n << c) | (n >> (-c) & m);
    return (n << c) | (n >> (T(0) - c) & m);
}
template<>
inline uint32_t rotl(const uint32_t n, const int i)
{
    constexpr int m = (std::numeric_limits<uint32_t>::digits - 1);
    const int c = i & m;
    return _rotl(n, c);
}
template<>
inline uchar rotl(const uchar n, const int i)
{
    constexpr uchar m = (std::numeric_limits<uchar>::digits - 1);
    const uchar c = i & m;
    return _rotl8(n, c);
}
template<>
inline ushort rotl(const ushort n, const int i)
{
    constexpr uchar m = (std::numeric_limits<ushort>::digits - 1);
    const uchar c = i & m;
    return _rotl16(n, c);
}
template<>
inline uint64_t rotl(const uint64_t n, const int i)
{
    constexpr int m = (std::numeric_limits<uint64_t>::digits - 1);
    const int c = i & m;
    return _rotl64(n, c);
}