Why don't modern compilers coalesce neighboring memory accesses?

If buf[0] is nonzero, the code will not access buf[1], so the function should return false without checking the other buf elements. If buf sits at the very end of a mapped memory page, reading buf[1] may trigger an access fault. The compiler must be careful not to read memory that the program may not be allowed to read.


The first thing to understand is that f(const char buf[4]) does not guarantee that the pointer points to 4 elements. It means exactly the same as const char *buf; the 4 is completely ignored by the language. (C99 has a solution to this, but it's not supported in C++; more on that below.)
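To make the "exactly the same as const char *buf" point concrete, here is a minimal C sketch. The names first_is_zero_arr, first_is_zero_ptr, byte_pred and decay_demo are made up for illustration. Both functions fit the same function-pointer type without a cast, and both can be handed a pointer to a single char.

```c
#include <assert.h>
#include <stdbool.h>

/* Parameter written with an array bound of 4 ... */
bool first_is_zero_arr(const char buf[4]) { return buf[0] == 0; }

/* ... and the same parameter written as a plain pointer. */
bool first_is_zero_ptr(const char *buf) { return buf[0] == 0; }

/* One function-pointer type fits both, because both declarations
   have the type bool(const char *). */
typedef bool (*byte_pred)(const char *);

/* Hypothetical demo: calls both functions with a pointer to ONE char. */
void decay_demo(void)
{
    byte_pred a = first_is_zero_arr;  /* no cast needed */
    byte_pred b = first_is_zero_ptr;
    const char only_one = 0;          /* just 1 element, not 4 */

    assert(a(&only_one));             /* each call reads buf[0] only */
    assert(b(&only_one));
}
```

Some compilers use the written bound as a hint for warnings, but that is a diagnostic nicety and not a language-level guarantee.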

Given AllZeroes(memset(malloc(1),~0,1)), the implementation

bool AllZeroes(const char buf[4])
{
    return buf[0] == 0 &&
           buf[1] == 0 &&
           buf[2] == 0 &&
           buf[3] == 0;
}

should work, because it never tries to read the second byte, buf[1] (which doesn't exist), once it notices that the first byte, buf[0], is non-zero, while the implementation

bool AllZeroes(const int32_t *buf)
{
    return (*buf == 0);
}

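is allowed to segfault, as it tries to read 4 bytes while only 1 byte exists (only 1 byte was malloced). Strictly speaking the over-read is undefined behavior, so a crash is possible but not guaranteed.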
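The one-byte case can be written out as a small C sketch. The helper name check_one_byte_buffer is hypothetical; AllZeroes is the short-circuiting version from above. It only stays within bounds because && stops at the first nonzero byte.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

bool AllZeroes(const char buf[4])
{
    return buf[0] == 0 &&
           buf[1] == 0 &&
           buf[2] == 0 &&
           buf[3] == 0;
}

/* Hypothetical helper: returns -1 if malloc fails,
   otherwise the result of AllZeroes as 0 or 1. */
int check_one_byte_buffer(void)
{
    char *p = malloc(1);       /* exactly 1 byte requested */
    if (p == NULL)
        return -1;
    memset(p, ~0, 1);          /* the only byte is now nonzero */
    int result = AllZeroes(p); /* reads p[0], then && short-circuits */
    free(p);
    return result;
}
```

Calling check_one_byte_buffer() touches only p[0]. A version that loads all 4 bytes at once would read 3 bytes that were never allocated.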

FWIW, Clang does coalesce the reads (and GCC doesn't) in C99 with the implementation

_Bool AllZeroes(const char buf[static 4])
{
    return buf[0] == 0 &
           buf[1] == 0 &
           buf[2] == 0 &
           buf[3] == 0;
}

which compiles to the same as

_Bool AllZeroes(const int32_t *buf)
{
    return (*buf == 0);
}

see https://godbolt.org/z/Grqs3En3K (thanks to Caze @libera #C for finding that)

  • Unfortunately, buf[static 4], which in C99 is a guarantee from the programmer to the compiler that the pointer points to at least 4 elements, is not supported in C++.
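Where buf[static 4] isn't available, one way for the programmer to make the same promise is to write the 4-byte read explicitly. This is only a sketch: the names AllZeroes4 and AllZeroes4_usage are made up, and the code assumes the caller really does pass at least 4 readable bytes. It uses memcpy rather than a cast to int32_t * to stay clear of alignment and aliasing problems. Compilers commonly turn a fixed-size memcpy like this into a single load, but that is an optimization, not something the language promises.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Precondition (assumed, NOT checked): buf points to at least
   4 readable bytes. All 4 bytes are always read. */
bool AllZeroes4(const char *buf)
{
    uint32_t word;
    memcpy(&word, buf, sizeof word);  /* copy 4 bytes into an integer */
    return word == 0;                 /* zero only if every byte is zero */
}

/* Hypothetical usage: both buffers are a full 4 bytes long. */
void AllZeroes4_usage(void)
{
    const char zeros[4] = {0, 0, 0, 0};
    const char mixed[4] = {0, 0, 1, 0};

    assert(AllZeroes4(zeros));
    assert(!AllZeroes4(mixed));
}
```

Unlike the && version, this must never be called with the 1-byte malloc buffer from earlier.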