Compile time string hashing

This is a little bit late, but I succeeded in implementing a compile-time CRC32 function with the use of constexpr. The problem with it is that at the time of writing, it only works with GCC and not MSVC nor Intel compiler.

Here is the code snippet:

// CRC32 Table (zlib polynomial)
static constexpr uint32_t crc_table[256] = {
    0x00000000L, 0x77073096L, 0xee0e612cL, 0x990951baL, 0x076dc419L,
    0x706af48fL, 0xe963a535L, 0x9e6495a3L, 0x0edb8832L, 0x79dcb8a4L,
    0xe0d5e91eL, 0x97d2d988L, 0x09b64c2bL, 0x7eb17cbdL, 0xe7b82d07L,
...
};
template<size_t idx>
constexpr uint32_t crc32(const char * str)
{
    return (crc32<idx-1>(str) >> 8) ^ crc_table[(crc32<idx-1>(str) ^ str[idx]) & 0x000000FF];
}

// This is the stop-recursion function
template<>
constexpr uint32_t crc32<size_t(-1)>(const char * str)
{
    return 0xFFFFFFFF;
}

// This doesn't take into account the nul char
#define COMPILE_TIME_CRC32_STR(x) (crc32<sizeof(x) - 2>(x) ^ 0xFFFFFFFF)

enum TestEnum
{
    CrcVal01 = COMPILE_TIME_CRC32_STR("stack-overflow"),
};

CrcVal01 is equal to 0x335CC04A

Hope this will help you!


At least by my reading of §7.1.5/3 and §5.19, the following might be legitimate:

unsigned constexpr const_hash(char const *input) {
    return *input ?
        static_cast<unsigned int>(*input) + 33 * const_hash(input + 1) :
        5381;
}

This does seem to follow the basic rules in §7.1.5/3:

  1. The form is: "return expression;"
  2. Its only parameter is a pointer, which is a scalar type, and therefore a literal type.
  3. Its return is unsigned int, which is also scalar (and therefore literal).
  4. There is no implicit conversion to the return type.

There is some question whether the *inputs involve an illegal lvalue to rvalue conversion, and I'm not sure I understand the rules in §5.19/2/6/21 and §4.1 well enough to be sure about that.

From a practical viewpoint, this code is accepted by (for one example) g++, at least as far back as g++ 4.7.1.

Usage would be something like:

switch(std::hash(value)) {
    case const_hash("one"): one(); break;
    case const_hash("two"): two(); break;
    // ...
    default: other(); break;
}

To comply with the requirements of §5.19/2/6/2 you might have to do something like this though:

// one of the `constexpr`s is probably redundant, but I haven't figure out which.
char constexpr * constexpr v_one = "one"; 

// ....

case const_hash(v_one): one(); break;
  1. I'm using the extra 'slash' numbers to refer to unnumbered bullet points, so this is the second bullet point inside if the sixth bullet point under §5.19/2. I think I might have to talk to Pete Becker about whether it's possible to add some sort of numbers/letters/roman numerals down the hierarchy to identify pieces like this...