Does not evaluating the expression to which sizeof is applied make it legal to dereference a null or invalid pointer inside sizeof in C++?

I believe this is currently underspecified in the standard, like many similar issues such as "What is the value category of the operands of C++ operators when unspecified?". I don't think the omission was intentional; as hvd points out, the answer is probably obvious to the committee.

In this specific case I think we have evidence of what the intention was. The GB 91 comment from the Rapperswil meeting says:

It is mildly distasteful to dereference a null pointer as part of our specification, as we are playing on the edges of undefined behaviour. With the addition of the declval function template, already used in these same expressions, this is no longer necessary.

It suggested an alternative expression. The comment refers to the following expression, which is no longer in the standard but can be found in N3090:

noexcept(*(U*)0 = declval<U>())

The suggestion was rejected on the grounds that the expression does not invoke undefined behavior, because it is unevaluated:

There is no undefined behavior because the expression is an unevaluated operand. It's not at all clear that the proposed change would be clearer.

This rationale applies to sizeof as well, since its operand is unevaluated.
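As a minimal sketch of what that rationale permits (assuming a C++11 compiler; the Widget type and the helper function are hypothetical names I made up for illustration), both sizeof and noexcept can be applied to a dereferenced null pointer, because only the type of the expression is inspected:

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

struct Widget { double payload[4]; };  // hypothetical type, for illustration only

// Unevaluated operands: the null pointer is never actually dereferenced,
// the compiler only determines the type of the expression.
static_assert(sizeof(*(Widget*)0) == sizeof(Widget), "only the type is inspected");
static_assert(sizeof(*(int*)0) == sizeof(int), "only the type is inspected");

// The N3090 expression quoted above with U = int: built-in assignment cannot throw.
static_assert(noexcept(*(int*)0 = std::declval<int>()), "unevaluated operand of noexcept");

// Hypothetical helper so the result can also be checked at run time.
std::size_t widget_size_via_null() {
    std::size_t n = sizeof(*static_cast<Widget*>(nullptr));
    assert(n == sizeof(Widget));
    return n;
}
```

Whether the standard's wording strictly guarantees this is the open question discussed here; the sketch only shows the behavior the committee's response describes.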

I say underspecified, but I wonder if this is covered by section 4.1 [conv.lval], which says:

The value contained in the object indicated by the lvalue is the rvalue result. When an lvalue-to-rvalue conversion occurs within the operand of sizeof (5.3.3) the value contained in the referenced object is not accessed, since that operator does not evaluate its operand.

It says the value contained is not accessed, which, if we follow the logic of core issue 232, means there is no undefined behavior:

In other words, it is only the act of "fetching", of lvalue-to-rvalue conversion, that triggers the ill-formed or undefined behavior

This is somewhat speculative since the issue is not settled yet.


Since you explicitly asked for standard references - [expr.sizeof]/1:

The operand is either an expression, which is an unevaluated operand (Clause 5), or a parenthesized type-id.

[expr]/8:

In some contexts, unevaluated operands appear (5.2.8, 5.3.3, 5.3.7, 7.1.6.2). An unevaluated operand is not evaluated.

Because the expression (i.e. the dereference) is never evaluated, it is not subject to some of the constraints it would otherwise violate. Only the type is inspected. In fact, the standard itself dereferences null pointers in an example in a note in [dcl.fct]/12:

A trailing-return-type is most useful for a type that would be more complicated to specify before the declarator-id:

template <class T, class U> auto add(T t, U u) -> decltype(t + u);

rather than

template <class T, class U> decltype((*(T*)0) + (*(U*)0)) add(T t, U u);

— end note ]
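To make the note concrete, here is a small sketch (C++11 assumed; add_null, add_declval and add_demo are hypothetical names, not from the standard) that spells the return type with dereferenced null pointers and, for comparison, with std::declval:

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Return type spelled with dereferenced null pointers, as in the quoted note.
template <class T, class U>
decltype((*(T*)0) + (*(U*)0)) add_null(T t, U u) { return t + u; }

// Same idea with std::declval. Note that declval<T>() is an rvalue while
// *(T*)0 is an lvalue, which can matter for overloaded operator+.
template <class T, class U>
decltype(std::declval<T>() + std::declval<U>()) add_declval(T t, U u) { return t + u; }

// Both spellings deduce the same type for built-in arithmetic.
static_assert(std::is_same<decltype(add_null(1, 2.5)), double>::value, "int + double is double");
static_assert(std::is_same<decltype(add_declval(1, 2.5)), double>::value, "int + double is double");

// Hypothetical helper that exercises both versions at run time.
double add_demo() {
    double r = add_null(1, 2.5);
    assert(r == add_declval(1, 2.5));
    return r;
}
```

The operand of decltype is unevaluated in both cases, so neither spelling touches memory; they differ only in the value category of the operands.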


The specification only says that dereferencing a pointer that is NULL is UB. Since sizeof is not a real function, and it doesn't actually use its operand for anything other than getting the type, it never dereferences the pointer. That's WHY it works. Someone else can get all the points for looking up the spec wording that states that "the argument to sizeof doesn't get referenced".

Note that it's also entirely legal to do int arr[2]; size_t s = sizeof(arr[-111100000]); because it doesn't matter what the index is: sizeof never actually "does anything" to the operand it is given.

Another example to show how it's "not doing anything" would be something like this:

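#include <cstddef>

int func()
{
    // Writing through an arbitrary address would be undefined behavior if this ever ran.
    int *ptr = reinterpret_cast<int*>(32);
    *ptr = 7;
    return 42;
}

// func() is never called here; only its return type (int) is inspected.
std::size_t size = sizeof(func());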

Again, this wouldn't crash, because func() is just resolved by the compiler to the type that it produces.
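If you want to convince yourself that nothing runs, a rough sketch like the following makes the non-evaluation observable (the counter calls and the functions noisy and measure are hypothetical names, used only for this illustration):

```cpp
#include <cassert>
#include <cstddef>

int calls = 0;  // hypothetical counter: how often noisy() has actually run

int noisy() {
    ++calls;
    return 42;
}

std::size_t measure() {
    int arr[2] = {0, 0};
    (void)arr;  // silence unused-variable warnings on some compilers

    // Neither operand is evaluated: noisy() is not called and the
    // out-of-range element is never accessed.
    std::size_t a = sizeof(noisy());
    std::size_t b = sizeof(arr[-111100000]);
    assert(a == sizeof(int) && b == sizeof(int));
    return a;
}
```

After calling measure(), calls is still 0, since the call expression inside sizeof only contributes its type.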

Equally, if sizeof actually "did something" with its operand, what would happen when you do this:

   char *buffer = new char[sizeof(char[10000000000])];

Would it create a 10000000000-byte stack allocation, then give the size back after it crashed the code because there aren't enough megabytes of stack? [On some systems, stack size is counted in bytes, not megabytes.] And whilst nobody writes code like that, you could easily come up with something similar using a typedef of buffer_type as an array of char, or some kind of struct with large content.
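Here is roughly what I mean by the typedef and struct variants, as a sketch only (big_record and big_size are hypothetical names, and the 64 MiB size is an assumption picked to exceed common default stack sizes while remaining a valid array type on 32-bit targets):

```cpp
#include <cassert>
#include <cstddef>

// An array type far larger than common default stack sizes.
typedef char buffer_type[64 * 1024 * 1024];

// Hypothetical struct with large content.
struct big_record {
    buffer_type data;
    int tag;
};

// The sizes are compile-time constants; no object of either type is created.
static_assert(sizeof(buffer_type) == 64u * 1024u * 1024u, "size comes from the type alone");
static_assert(sizeof(big_record) >= sizeof(buffer_type) + sizeof(int), "struct holds both members");

std::size_t big_size() {
    std::size_t n = sizeof(big_record);  // no big_record is ever allocated
    assert(n >= sizeof(buffer_type) + sizeof(int));
    return n;
}
```

Calling big_size() needs no more stack than any other trivial function, because sizeof never materializes the object.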