How can I make my class immune to the "auto value = copy of proxy" landmine in C++?

I have a fairly complex maths library I'm working on, and I have discovered a nasty bug when client code uses auto. Halfway through creating a minimal reproductive case to ask a question about it, I realise I can reproduce something similar using the standard library alone. See this simple test case:

#include <vector>
#include <assert.h>

int main()
{
    std::vector<bool> allTheData = {true, false, true};

    auto boolValue = allTheData[1]; // This should be false - we just declared it.
    assert(boolValue == false);
    boolValue = !boolValue;
    assert(boolValue == true);

    assert(allTheData[1] == false); // Huh? But we never changed the source data! Only our local copy.
}

Live on Godbolt. (Fun fact: Clang actually optimises this to writing "7" - 3 true bits - and a call to __assert_fail.)

(Yes I know std::vector<bool> sucks - but in this case it's handy to create a minimum reproducible example that's only a few lines long) Here's a longer example that doesn't use std::vector<bool>, and uses a custom container type, with assignment and copy/move deleted, and still shows the problem.

I understand what is going on under the hood, there's a proxy class returned by operator[] meant to implement allTheData[1] = true and related functionality, the client code that is written as if it's reading the value is actually storing the proxy in boolValue, and then when the client later modifies what it thinks is a bool, the original source data is modified instead. TLDR: 'auto' copied the proxy.

The code did what the programmer told it to do, not what the programmer meant.

If the programmer wanted boolValue's changes to update the source data, they would've done auto& boolValue = ..., which works with operator[] implementations returning T&, but not those needing custom proxies which fake reference-like behavior.

All the copy and move constructors and both assignment operators for the proxy are declared private (have also tried = delete), but this bug isn't caught at compile time. The proxy is copied regardless of whether the copy constructor is deleted.

All the "fixes" I've found for this bug focus on the client part of the code. They're things like: "don't use auto", "cast to the underlying type", "access through a const ref", etc. These are all substandard fixes, once you discover the bad behavior you can add one of these as a hack fix, but the underlying problem remains to catch the next unsuspecting user.

I'd rather remove the landmine than keep bypassing it, and putting up a sign saying "don't use auto", or "always use const", just marks the minefield, it doesn't remove it.

How can I make my library immune to this gotcha? (Without changing the client code!)

  • First preference would be the code works as written - assert(allTheData[1] == false) passes
    • A way to define the decay type of the proxy when it's written to auto?. So decltype(boolValue) is bool?
    • An implicit conversion operator that takes precedence over copying?
    • Any other way to make this pass without changing the code snippet above?
  • Second preference is there a way to make writing a proxy to a variable a compile error?
    • I'm declaring copy and move constructors as delete, and move and copy assignment operators as delete. Still compiles.
    • Is there anyway to declare a class as unable to become a lvalue?
  • Is there anything in proposed c++ future standards that will fix this?

Also an issue is code like:

std::vector<bool> ReadFlags();
... later ...
auto databaseIsLockedFlag = ReadFlags()[FLAG_DB_LOCKED];
if (databaseIsLockedFlag) <-- Crash here. Proxy has outlived temporary vector.

I'm only using vector here as it's a really simple example of the problem. This isn't a bug with vector, this is a bug with the proxy type pattern, of which vector is an example to show the problem.

Strangely enough MSVC's Intellisense engine sometimes reports copying a no-move-no-copy proxy type as a compile error, but then compiles it fine anyway:

enter image description here
It'd be really nice if this intellisense compile error was a real compile error. Sigh


Solution 1:

Reduce the damage by adding "&&" to the end of the proxy class's operator=

(And operator +=, -=, etc.)

Took me a lot of experimenting but I eventually found a way to mitigate the most common case of the problem, this tightens it so you can still copy the proxy, but once you've copied it to a stack variable, you can't modify it and inadvertently corrupt the source container.

#include <cstdio>
#include <utility>

auto someComplexMethod()
{
  struct s
  {
    void operator=(int A)&& {std::printf("Setting A to %i", A);}
  };
  return s();
}

int main()
{
  someComplexMethod() = 4; // Compiles. Yay

  auto b = someComplexMethod(); 
  // Unfortunately that still compiles, and it's still taking a 
  // copy of the proxy, but no damage is done yet.

  b = 5; 
  // That doesn't compile. Error given is: 
  //   No overload for '='  note: candidate function not viable: 
  //   expects an rvalue for object argument

  std::move(b) = 6; 
  // That compiles, but is basically casting around the 
  // protections, aka shooting yourself in the foot.
}

Solution 2:

Yes, this is indeed a problem. Afaik there is no solution to this in current C++ (C++20 at the time of writing) other than to change the code at the calling site which is not ideal.

There is the proposal P0672R0 Implicit Evaluation of “auto” Variables (from 2017) which tries to deal with this exact problem. It uses as example proxy classes in math libraries just like your case and gives example with std::vector<bool> just like you. And it gives more examples of problems that arrive from this pattern.

The paper proposes 3 solutions to this, all implemented in language:

  • operator notation:

    class product_expr
    {
        matrix operator auto() { ... }
    };
    
  • using declaration:

    class product_expr
    {
        using auto = matrix;
    };
    
  • specialization of decay:

    make auto x= expr be defined as typename std::decay<decltype(expr)>::type x=expr; and then use used can specialize std::decay

The discussion at the standard committee meetings strongly favored the using declaration solution. However I wasn't able to find any more updates on this paper so I personally don't see this paper or something similar being implemented in the language in the near future.

So, unfortunately, for now your only solution is to educate your users on the proxy classes your library uses.

Solution 3:

I have an obscure idea, not sure how practical it is. It doesn't override what auto deduces to (which seems impossible), but merely causes copying the proxy to a variable ill-formed.

  • Make the proxy non-copyable.

  • That alone doesn't stop you from saving the proxy to an auto variable, due to mandatory RVO. To counteract it, you return the proxy by reference instead.

  • To avoid getting a dangling reference, you construct the proxy in a default function argument, which have the same lifetime as regular arguments - until the end of a full expression.

  • User could still save a reference to the proxy. To make this kind of misuse harder, you return an rvalue-reference, and &&-qualify all member functions.

  • This prevents any interaction with the dangling proxy reference, unless you std::move it. This should be obscure enough to stop your users, but if not, you'll have to rely on some sanitizer or set a (volatile?) flag in the destructor of the proxy, and check it each time you access it (which is UB, but should be reliable enough).

Example:

namespace impl
{
    class Proxy
    {
        Proxy() {}
      public:
        static Proxy construct() {return {};}
        Proxy(const Proxy &) = delete;
        Proxy &operator=(const Proxy &) = delete;
        int *ptr = nullptr;
        int operator=(int value) && {return *ptr = value;}
    };
}

impl::Proxy &&make_proxy(int &target, impl::Proxy &&proxy = impl::Proxy::construct())
{
    proxy.ptr = &target;
    return std::move(proxy);
}

Then:

int x = 0;
make_proxy(x) = 1; // Works.
auto a = make_proxy(x); // error: call to deleted constructor of 'impl::Proxy'
auto &b = make_proxy(x); // error: non-const lvalue reference to type 'impl::Proxy' cannot bind to a temporary of type 'impl::Proxy'
const auto &c = make_proxy(x); // Compiles, is a dangling reference. BUT!
c = 2; // error: no viable overloaded '='
auto &&d = make_proxy(x); // Compiles, is a dangling reference.
d = 3; // error: no viable overloaded '='
std::move(d) = 2; // Compiles, causes UB. Needs a runtime check.

This is harder to pull off with overloaded operators, which (except ()) can't have default arguments. But still doable:

namespace impl
{
    struct Index
    {
        int value = 0;
        Proxy &&proxy;
        Index(int value, Proxy &&proxy = Proxy::construct()) : value(value), proxy(std::move(proxy)) {}
        Index(const Index &) = delete;
        Index &operator=(const Index &) = delete;
    };
}

struct B
{
    int x = 0;
    impl::Proxy &&operator[](impl::Index index)
    {
        index.proxy.ptr = &x;
        return std::move(index.proxy);
    }
};

The only downside is that since at most one user-defined implicit conversion is allowed for any argument, this operator[] will only work with int arguments, and not with classes with operator int.