Why do constant expressions have an exclusion for undefined behavior?

I was researching what is allowed in a core constant expression*, which is covered in section 5.19 Constant expressions paragraph 2 of the draft C++ standard which says:

A conditional-expression is a core constant expression unless it involves one of the following as a potentially evaluated subexpression (3.2), but subexpressions of logical AND (5.14), logical OR (5.15), and conditional (5.16) operations that are not evaluated are not considered [ Note: An overloaded operator invokes a function.—end note ]:

and lists out the exclusions in the bullets that follows and includes (emphasis mine):

an operation that would have undefined behavior [ Note: including, for example, signed integer overflow (Clause 5), certain pointer arithmetic (5.7), division by zero (5.6), or certain shift operations (5.8) —end note ];

Huh? Why do constant expressions need this clause to cover undefined behavior? Is there something special about constant expressions that requires undefined behavior to have a special carve out in the exclusions?

Does having this clause give us any advantages or tools we would not have without it?

For reference, this looks like the last revision of the proposal for Generalized Constant Expressions.


Solution 1:

The wording is actually the subject of defect report #1313 which says:

The requirements for constant expressions do not currently, but should, exclude expressions that have undefined behavior, such as pointer arithmetic when the pointers do not point to elements of the same array.

The resolution being the current wording we have now, so this clearly was intended, so what tools does this give us?

Let's see what happens when we try to create a constexpr variable with an expression that contains undefined behavior, we will use clang for all the following examples. This code (see it live):

constexpr int x = std::numeric_limits<int>::max() + 1 ;

produces the following error:

error: constexpr variable 'x' must be initialized by a constant expression
    constexpr int x = std::numeric_limits<int>::max() + 1 ;
                  ^   ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
note: value 2147483648 is outside the range of representable values of type 'int'
    constexpr int x = std::numeric_limits<int>::max() + 1 ;
                                       ^

This code (see it live):

constexpr int x = 1 << 33 ;  // Assuming 32-bit int

produces this error:

error: constexpr variable 'x' must be initialized by a constant expression
    constexpr int x = 1 << 33 ;  // Assuming 32-bit int
             ^   ~~~~~~~
note: shift count 33 >= width of type 'int' (32 bits)
    constexpr int x = 1 << 33 ;  // Assuming 32-bit int
                  ^

and this code which has undefined behavior in a constexpr function:

constexpr const char *str = "Hello World" ;      

constexpr char access( int index )
{
    return str[index] ;
}

int main()
{
    constexpr char ch = access( 20 ) ;
}

produces this error:

error: constexpr variable 'ch' must be initialized by a constant expression
    constexpr char ch = access( 20 ) ;
                   ^    ~~~~~~~~~~~~

 note: cannot refer to element 20 of array of 12 elements in a constant expression
    return str[index] ;
           ^

Well that is useful the compiler can detect undefined behavior in constexpr, or at least what clang believes is undefined. Note, gcc behaves the same except in the case of undefined behavior with right and left shift, gcc will usually produce a warning in these cases but still sees the expression as constant.

We can use this functionality via SFINAE to detect whether an addition expression would cause overflow, the following contrived example was inspired by dyp's clever answer here:

#include <iostream>
#include <limits>

template <typename T1, typename T2>
struct addIsDefined
{
     template <T1 t1, T2 t2>
     static constexpr bool isDefined()
     {
         return isDefinedHelper<t1,t2>(0) ;
     }

     template <T1 t1, T2 t2, decltype( t1 + t2 ) result = t1+t2>
     static constexpr bool isDefinedHelper(int)
     {
         return true ;
     }

     template <T1 t1, T2 t2>
     static constexpr bool isDefinedHelper(...)
     {
         return false ;
     }
};


int main()
{    
    std::cout << std::boolalpha <<
      addIsDefined<int,int>::isDefined<10,10>() << std::endl ;
    std::cout << std::boolalpha <<
     addIsDefined<int,int>::isDefined<std::numeric_limits<int>::max(),1>() << std::endl ;
    std::cout << std::boolalpha <<
      addIsDefined<unsigned int,unsigned int>::isDefined<std::numeric_limits<unsigned int>::max(),std::numeric_limits<unsigned int>::max()>() << std::endl ;
}

which results in (see it live):

true
false
true

It is not evident that the standard requires this behavior but apparently this comment by Howard Hinnant indicates it indeed is:

[...] and is also constexpr, meaning UB is caught at compile time

Update

Somehow I missed Issue 695 Compile-time calculation errors in constexpr functions which revolves over the wording of section 5 paragraph 4 which used to say (emphasis mine going forward):

If during the evaluation of an expression, the result is not mathematically defined or not in the range of representable values for its type, the behavior is undefined, unless such an expression appears where an integral constant expression is required (5.19 [expr.const]), in which case the program is ill-formed.

and goes on to say:

intended as an acceptable Standardese circumlocution for “evaluated at compile time,” a concept that is not directly defined by the Standard. It is not clear that this formulation adequately covers constexpr functions.

and a later note says:

[...]There is a tension between wanting to diagnose errors at compile time versus not diagnosing errors that will not actually occur at runtime.[...]The consensus of the CWG was that an expression like 1/0 should simply be considered non-constant; any diagnostic would result from the use of the expression in a context requiring a constant expression.

which if I am reading correctly confirms the intention was to be able to diagnose undefined behavior at compile time in the context requiring a constant expression.

We can not definitely say this was the intent but is does strongly suggest it was. The difference in how clang and gcc treat undefined shifts does leave some room for doubt.

I filed a gcc bug report: Right and left shift undefined behavior not an error in a constexpr. Although it seems like this is conforming, it does break SFINAE and we can see from my answer to Is it a conforming compiler extension to treat non-constexpr standard library functions as constexpr? that divergence in implementation observable to SFINAE users seems undesirable to the committee.

Solution 2:

When we talk about undefined behavior, it is important to remember that the Standard leaves the behavior undefined for these cases. It does not prohibit implementations from making stronger guarantees. For example some implementations may guarantee that signed integer overflow wraps around, while others may guarantee saturation.

Requiring compilers to process constant expressions involving undefined behavior would limit the guarantees that an implementation could make, restricting them to producing some value without side effects (what the Standard calls indeterminate value). That excludes a lot of the extended guarantees found in the real world.

For example, some implementation or companion standard (i.e. POSIX) may define the behavior of integral division by zero to generate a signal. That's a side effect which would be lost if the expression were calculated at compile-time instead.

So, these expressions are rejected at compile-time to avoid loss of side effects in the execution environment.

Solution 3:

There is another point to excluding undefined behavior from constant expressions: constant expressions should, by definition, be evaluated by the compiler at compile time. Allowing a constant expression to invoke undefined behavior would allow the compiler itself to show undefined behavior. And a compiler that formats your hard-drive because you compile some evil code is not something you want to have.