Why does this implementation of strlen() work?
Solution 1:
Although this is technically undefined behavior, in practice no native architecture checks for out-of-bounds memory access at a finer granularity than the size of a word. So while garbage past the terminator may end up being read, the result will not be a crash.
Solution 2:
"I don't see at all why alignment would be relevant if the array is not long enough and we are reading past its end."
The routine starts by aligning to a word boundary for two reasons. First, reading words from an aligned address is faster on most architectures (and it is mandatory on a few CPUs). The speed increase is large enough that the same trick is used in a host of similar operations: memcpy, strcpy, memmove, memchr, etc.
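The alignment step described above can be sketched roughly as follows. This is only an illustration, not the code from the question: the helper name scan_to_word_boundary is hypothetical, and "word" is assumed to mean unsigned long.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not from the original source): advance one byte at a
   time until `s` sits on an address that is a multiple of
   sizeof(unsigned long), or until the terminator is reached, whichever
   comes first. Returns the position where scanning stopped. */
static const char *scan_to_word_boundary(const char *s)
{
    assert(s != NULL);
    while ((uintptr_t)s % sizeof(unsigned long) != 0 && *s != '\0')
        s++;
    return s;
}
```

If the returned pointer refers to a zero byte, the string ended before the boundary and no word reads are needed at all; otherwise the pointer is aligned and word-sized reads can start there.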
Second: if you continue reading words starting at a word boundary, you are assured that every word you read lies entirely within a single memory page, and that this page holds at least one byte of the string. A string (including its terminating zero) cannot straddle a memory page boundary, and neither can an aligned word read. (1)
So this is both fast and safe, even if the memory page granularity were a single word of LONG_BIT bits (which it isn't).
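The page argument is plain integer arithmetic, and a small check makes it concrete. This is a sketch under the assumption that the page size is a multiple of the word size (for example 4096-byte pages and 8-byte words); the helper name access_stays_in_page is made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: true if an access of `size` bytes starting at `addr`
   touches only one page of `page_size` bytes, i.e. its first and last byte
   have the same page number. */
static bool access_stays_in_page(uintptr_t addr, size_t size, size_t page_size)
{
    assert(size > 0 && page_size > 0);
    return addr / page_size == (addr + size - 1) / page_size;
}
```

With those assumed sizes, an 8-byte read at the aligned address 4088 ends at byte 4095 and stays in page 0, whereas an unaligned 8-byte read at 4093 would run into page 1. That second case is exactly what the alignment step rules out.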
Picking up an entire word near the end of a string may pick up additional bytes after the final zero, but reading indeterminate bytes from valid memory is not UB; only acting upon their contents is (2). If the word contains a zero terminator anywhere inside, the individual bytes are inspected with test_byte, and this, as is shown in the original source, will never act on bytes after the terminator.
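Putting the pieces together, a word-at-a-time strlen might look roughly like the sketch below. It is not the source from the question: sketch_strlen and word_has_zero_byte are hypothetical names, the final byte loop merely stands in for the original's test_byte step, and the zero-byte test is the well-known subtract-and-mask trick. The word is loaded with memcpy to sidestep aliasing problems, but reading past the end of the array is still formally outside what standard C guarantees, so treat this as an illustration to be run on padded buffers only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ONES  (~0UL / 0xFFUL)   /* 0x0101...01: every byte is 0x01 */
#define HIGHS (ONES * 0x80UL)   /* 0x8080...80: every byte is 0x80 */

/* Nonzero if at least one byte of `w` is zero. */
static int word_has_zero_byte(unsigned long w)
{
    return ((w - ONES) & ~w & HIGHS) != 0;
}

/* Hypothetical sketch of a word-at-a-time strlen. */
static size_t sketch_strlen(const char *s)
{
    const char *p = s;
    assert(s != NULL);

    /* Prologue: byte by byte until p is word-aligned. */
    while ((uintptr_t)p % sizeof(unsigned long) != 0) {
        if (*p == '\0')
            return (size_t)(p - s);
        p++;
    }

    /* Main loop: one aligned word per iteration. The load may include
       bytes located after the terminator; they are only fed to the
       zero-byte test and otherwise ignored. */
    for (;;) {
        unsigned long w;
        memcpy(&w, p, sizeof w);
        if (word_has_zero_byte(w))
            break;
        p += sizeof w;
    }

    /* Epilogue: inspect the bytes of the final word one at a time. The
       loop stops at the first zero, so nothing after the terminator is
       ever examined. */
    while (*p != '\0')
        p++;
    return (size_t)(p - s);
}
```

Note that the only place where bytes beyond the terminator influence anything is the whole-word zero test, and its answer is the same whatever those bytes contain, because the terminator itself already makes it succeed.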
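(1) Obviously a string can span more than one page; what I meant is that neither the string nor an aligned word read can run "into a locked page" or something similar, that is, into memory the process is not allowed to access.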
(2) Under discussion. See the comments (sorry about that!) under Sneftel's answer.