Why does the enhanced GCC 6 optimizer break practical C++ code?

Solution 1:

I guess the question that needs to be answered is why well-intentioned people would write the checks in the first place.

The most common case is probably if you have a class that is part of a naturally occurring recursive call.

If you had:

struct Node
{
    Node* left;
    Node* right;
};

in C, you might write:

void traverse_in_order(Node* n) {
    if(!n) return;
    // ... traverse n->left, process n, traverse n->right ...
}

In C++, it's nice to make this a member function:

void Node::traverse_in_order() {
    // <--- What check should be put here?
    // ... traverse left, process this node, traverse right ...
}

In the early days of C++ (prior to standardization), it was emphasized that member functions were syntactic sugar for a function where the this parameter is implicit. Code was written in C++, converted to equivalent C, and compiled. There were even explicit examples showing that comparing this to null was meaningful, and the original Cfront compiler took advantage of this too. So, coming from a C background, the obvious choice for the check is:

if(this == nullptr) return;      

Note: Bjarne Stroustrup has even mentioned that the rules for this have changed over the years.

And this worked on many compilers for many years. When standardization happened, this changed. More recently, compilers started taking advantage of the fact that calling a member function through a null pointer is undefined behavior: inside a member function, this == nullptr can be assumed to be always false, so the compiler is free to omit the check.

That means that to do any traversal of this tree, you need to either:

  • Do all of the checks before calling traverse_in_order

    void Node::traverse_in_order() {
        if(left) left->traverse_in_order();
        if(right) right->traverse_in_order();
    }

    This means also checking at EVERY call site if you could have a null root.

  • Don't use a member function

    This means that you're writing the old C-style code (perhaps as a static method) and calling it with the object explicitly as a parameter, e.g. you're back to writing Node::traverse_in_order(node); rather than node->traverse_in_order(); at the call site.

  • I believe the easiest/neatest way to fix this particular example in a way that is standards compliant is to actually use a sentinel node rather than a nullptr.

    // static class member, or global variable
    Node sentinel;
    void Node::traverse_in_order() {
        if(this == &sentinel) return;
        // ... traverse left, process this node, traverse right ...
    }

Neither of the first two options seems that appealing, and as long as code could get away with it, people wrote bad code with this == nullptr checks instead of using a proper fix.
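As a minimal sketch of the sentinel option: the value field, the output vector and the Node::sentinel static member below are assumptions added for illustration, not part of the original example. Every "empty" child points at the sentinel object, so traverse_in_order is never called through a null pointer.

```cpp
#include <cassert>
#include <vector>

struct Node {
    int value;
    Node* left;
    Node* right;

    static Node sentinel;  // hypothetical shared "empty subtree" marker

    // Children default to the sentinel instead of nullptr.
    explicit Node(int v, Node* l = &sentinel, Node* r = &sentinel)
        : value(v), left(l), right(r) {
        assert(left != nullptr && right != nullptr);
    }

    // Appends the values in in-order sequence; no null check on this.
    void traverse_in_order(std::vector<int>& out) {
        if (this == &sentinel) return;  // well-defined: sentinel is a real object
        left->traverse_in_order(out);
        out.push_back(value);
        right->traverse_in_order(out);
    }
};

// The sentinel's own children point back at itself.
Node Node::sentinel{0, &Node::sentinel, &Node::sentinel};
```

An empty tree is then represented by a root pointer equal to &Node::sentinel, and calling traverse_in_order on it simply returns.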

I'm guessing that's how some of these code bases evolved to have this == nullptr checks in them.

Solution 2:

It does so because the "practical" code was broken and involved undefined behavior to begin with. There's no reason to use a null this, other than as a micro-optimization, usually a very premature one.

It's a dangerous practice, since adjustment of pointers due to class hierarchy traversal can turn a null this into a non-null one. So, at the very least, the class whose methods are supposed to work with a null this must be a final class with no base class: it can't derive from anything, and it can't be derived from. We're quickly departing from practical to ugly-hack-land.
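To make the pointer adjustment concrete, here is a small sketch using hypothetical Base1, Base2 and Derived types (they are not from any of the code bases discussed). It only exercises the well-defined side: converting a pointer to a base class can change its address, and an explicit conversion has to map null to null.

```cpp
#include <cassert>

struct Base1 { int a = 1; };
struct Base2 { int b = 2; };
struct Derived : Base1, Base2 { int c = 3; };

// A pointer conversion must map null to null, so this cannot simply add
// the base-class offset unconditionally.
inline Base2* to_second_base(Derived* d) {
    return static_cast<Base2*>(d);
}

// True when the two base subobjects live at different addresses, i.e. at
// least one of the conversions had to adjust the pointer.
inline bool bases_at_different_addresses(Derived& d) {
    const void* first = static_cast<Base1*>(&d);
    const void* second = static_cast<Base2*>(&d);
    return first != second;
}

// Small self-check on a live object and on a null pointer.
inline void adjustment_example() {
    Derived d;
    assert(bases_at_different_addresses(d));
    assert(to_second_base(nullptr) == nullptr);
}
```

When a Base2 member function is called through a Derived*, the same offset has to be applied to this. A compiler that assumes this is non-null may apply it without the null test, which is how a null pointer can arrive in the member function as a small non-null this.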

In practical terms, the code doesn't have to be ugly:

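struct Node
{
  Node* left;
  Node* right;
  void process();
  void traverse_in_order() { traverse_in_order_impl(this); }
private:
  static void traverse_in_order_impl(Node * n) {
    if (!n) return;
    traverse_in_order_impl(n->left);
    n->process();
    traverse_in_order_impl(n->right);
  }
};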

"If you had an empty tree (e.g. root is nullptr), this solution is still relying on undefined behavior by calling traverse_in_order with a nullptr."

If the tree is empty, a.k.a. a null Node* root, you aren't supposed to be calling any non-static methods on it. Period. It's perfectly fine to have C-like tree code that takes an instance pointer by an explicit parameter.
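A minimal sketch of that C-like style, assuming a hypothetical count operation in place of real per-node processing: the possibly-null pointer is an ordinary argument to a static member function, so testing it is perfectly legal, and an empty tree needs no special handling at the call site.

```cpp
#include <cassert>
#include <cstddef>

struct Node {
    Node* left = nullptr;
    Node* right = nullptr;

    // Static member: there is no this, so a null argument is just a value.
    static std::size_t count(const Node* n) {
        if (!n) return 0;  // the null check is legal here
        return count(n->left) + 1 + count(n->right);
    }

    // Convenience wrapper for callers that already hold a real object.
    std::size_t count() const { return count(this); }
};

// Call-site usage: an empty tree is simply a null root.
inline void usage_example() {
    Node leaf;
    Node root;
    root.left = &leaf;
    assert(Node::count(&root) == 2);
    assert(Node::count(nullptr) == 0);
}
```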

The argument here seems to boil down to somehow needing to write non-static methods on objects that could be called from a null instance pointer. There's no such need. The C-with-objects way of writing such code is still way nicer in the C++ world, because it can be type safe at the very least. Basically, the null this is such a micro-optimization, with such a narrow field of use, that disallowing it is IMHO perfectly fine. No public API should depend on a null this.

Solution 3:

"The change document clearly calls this out as dangerous because it breaks a surprising amount of frequently used code."

The document doesn't call it dangerous. Nor does it claim that it breaks a surprising amount of code. It simply points out a few popular code bases which it claims to be known to rely on this undefined behaviour and would break due to the change unless the workaround option is used.

"Why would this new assumption break practical C++ code?"

If practical C++ code relies on undefined behaviour, then changes to that undefined behaviour can break it. This is why UB is to be avoided, even when a program relying on it appears to work as intended.

"Are there particular patterns where careless or uninformed programmers rely on this particular undefined behavior?"

I don't know if it's a widespread anti-pattern, but an uninformed programmer might think that they can stop their program from crashing by doing:

if (this)
    member_variable = 42;

When the actual bug is dereferencing a null pointer somewhere else.
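A hedged sketch of where the check belongs instead. Widget, find_widget and set_answer_if_found are hypothetical names invented for this illustration. The pointer is tested at the call site, before any member function is invoked on it, and the member function itself contains no this check.

```cpp
#include <cassert>

struct Widget {
    int member_variable = 0;

    // No this check: a member function may assume it has an object.
    void set_answer() { member_variable = 42; }
};

// Hypothetical lookup that can fail and return nullptr.
inline Widget* find_widget(Widget* candidates, int count, int index) {
    if (index < 0 || index >= count) return nullptr;
    return &candidates[index];
}

// The null check lives at the call site, before the member call.
inline bool set_answer_if_found(Widget* candidates, int count, int index) {
    Widget* w = find_widget(candidates, count, index);
    if (!w) return false;
    w->set_answer();
    return true;
}

inline void call_site_example() {
    Widget widgets[2];
    const bool found = set_answer_if_found(widgets, 2, 1);
    const bool missing = set_answer_if_found(widgets, 2, 5);
    assert(found && !missing);
    assert(widgets[1].member_variable == 42);
}
```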

I'm sure that if a programmer is uninformed enough, they will be able to come up with more advanced (anti)-patterns that rely on this UB.

"I cannot imagine anyone writing if (this == NULL) because that is so unnatural."

I can.

Solution 4:

Some of the "practical" (funny way to spell "buggy") code that was broken looked like this:

void foo(X* p) { p->bar()->baz(); }

and it forgot to account for the fact that p->bar() sometimes returns a null pointer, which means that dereferencing it to call baz() is undefined.
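One way such a function could be repaired, as a sketch only: the X and Y definitions, the child member and the bool return value are assumptions made up for this illustration. The result of bar() is tested before baz() is called on it.

```cpp
#include <cassert>

struct Y {
    int calls = 0;
    void baz() { ++calls; }
};

struct X {
    Y* child = nullptr;          // hypothetical: may legitimately be null
    Y* bar() { return child; }
};

// Returns true if baz() was called, false if there was nothing to call it on.
inline bool foo(X* p) {
    if (!p) return false;
    Y* y = p->bar();
    if (!y) return false;  // check the result before the member call
    y->baz();
    return true;
}

inline void foo_example() {
    X without_child;
    const bool called = foo(&without_child);
    assert(!called);  // bar() returned null, so baz() was skipped
}
```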

Not all the code that was broken contained explicit if (this == nullptr) or if (!p) return; checks. Some cases were simply functions that didn't access any member variables, and so appeared to work OK. For example:

struct DummyImpl {
  bool valid() const { return false; }
  int m_data;
};
struct RealImpl {
  bool valid() const { return m_valid; }
  bool m_valid;
  int m_data;
};

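template<typename T>
void do_something_else(T* p) {
  if (p) {
    use(p->m_data);  // use() is a placeholder for any access through p
  }
}

template<typename T>
void func(T* p) {
  if (p->valid())
    do_something(p);  // placeholder for the valid-object path
  else
    do_something_else(p);
}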

In this code, when you call func<DummyImpl>(DummyImpl*) with a null pointer, there is a "conceptual" dereference of the pointer to call p->DummyImpl::valid(), but in fact that member function just returns false without accessing *this. That return false can be inlined, and so in practice the pointer doesn't need to be accessed at all. So with some compilers it appears to work OK: there's no segfault for dereferencing null, p->valid() is false, so the code calls do_something_else(p), which checks for null pointers and so does nothing. No crash or unexpected behaviour is observed.

With GCC 6 you still get the call to p->valid(), but the compiler now infers from that expression that p must be non-null (otherwise p->valid() would be undefined behaviour) and makes a note of that information. That inferred information is used by the optimizer so that if the call to do_something_else(p) gets inlined, the if (p) check is now considered redundant, because the compiler remembers that it is not null, and so inlines the code to:

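template<typename T>
void func(T* p) {
  if (p->valid())
    do_something(p);  // placeholder for the valid-object path
  else {
    // inlined body of do_something_else(p) with value propagation
    // optimization performed to remove null check.
    use(p->m_data);  // placeholder access: p is now dereferenced unconditionally
  }
}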

This now really does dereference a null pointer, and so code that previously appeared to work stops working.

In this example the bug is in func, which should have checked for null first (or the callers should never have called it with null):

template<typename T>
void func(T* p) {
  if (p && p->valid())
    do_something(p);  // placeholder for the valid-object path
  else do_something_else(p);
}
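For a version that can be compiled and exercised, here is a sketch of the corrected func. The default member values, the use() and last_used() helpers and the body of do_something() are assumptions added so the effect can be observed; only the p && p->valid() test is the actual fix.

```cpp
#include <cassert>

struct DummyImpl {
    bool valid() const { return false; }
    int m_data = 0;
};

struct RealImpl {
    bool valid() const { return m_valid; }
    bool m_valid = true;
    int m_data = 7;
};

// Hypothetical sink standing in for whatever the real code does with m_data.
inline int& last_used() { static int value = -1; return value; }
inline void use(int data) { last_used() = data; }

template<typename T>
void do_something(T* p) { use(p->m_data); }

template<typename T>
void do_something_else(T* p) {
    if (p) use(p->m_data);
}

template<typename T>
void func(T* p) {
    if (p && p->valid())   // p is tested before it is dereferenced
        do_something(p);
    else
        do_something_else(p);
}

inline void null_argument_example() {
    last_used() = -1;
    func<DummyImpl>(nullptr);   // well-defined: nothing is dereferenced
    assert(last_used() == -1);  // and nothing was used
}
```

Because the null test now comes before the first dereference, there is no earlier point from which the compiler could infer that p is non-null, so the check inside do_something_else stays meaningful after inlining.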

An important point to remember is that most optimizations like this are not a case of the compiler saying "ah, the programmer tested this pointer against null, I will remove it just to be annoying". What happens is that various run-of-the-mill optimizations like inlining and value range propagation combine to make those checks redundant, because they come after an earlier check or a dereference. If the compiler knows that a pointer is non-null at point A in a function, and the pointer isn't changed before a later point B in the same function, then it knows it is also non-null at B. When inlining happens, points A and B might actually be pieces of code that were originally in separate functions but are now combined into one piece of code, and the compiler is able to apply its knowledge that the pointer is non-null in more places. This is a basic but very important optimization, and if compilers didn't do it, everyday code would be considerably slower and people would complain about unnecessary branches to re-test the same conditions repeatedly.
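The point A / point B reasoning can be sketched with two hypothetical functions (read_or_zero and sum_twice are invented for this illustration). The code is well-defined as long as callers respect the stated precondition. Whether a given compiler actually removes the inner check depends on inlining decisions and optimization level.

```cpp
#include <cassert>

// Point B: a defensive helper with its own null check.
inline int read_or_zero(const int* p) {
    if (!p) return 0;
    return *p;
}

// Point A: p is dereferenced first, so a conforming program can only reach
// the helper call with a non-null p. Once read_or_zero is inlined here, the
// compiler may treat the helper's check as redundant and drop the branch.
inline int sum_twice(const int* p) {
    assert(p != nullptr);            // precondition: callers must pass an object
    int first = *p;                  // point A: requires p != nullptr
    return first + read_or_zero(p);  // point B: the null branch cannot be taken
}
```

For valid input the observable result is the same with or without the optimization. Only a caller that already violated the precondition can see a difference, which is the situation the answers above describe.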