Why do I need to explicitly write the 'auto' keyword?

Dropping the explicit auto would break the language:

e.g.

int main()
{
    int n;
    {
        auto n = 0; // this shadows the outer n.
    }
}

Here, dropping the auto would turn the inner line into a plain assignment to the outer n, rather than a declaration of a new n that shadows it.
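To make the difference observable, here is a minimal sketch in today's C++ (the function names are made up for illustration). One function keeps the auto, the other drops it, and each returns what is left in the outer n:

```cpp
#include <cassert>

// With auto: the inner line declares a new n, so the outer n keeps its value.
int outer_n_with_auto()
{
    int n = 1;
    {
        auto n = 0; // new variable that shadows the outer n
        (void)n;    // only here to silence unused-variable warnings
    }
    return n;
}

// Without auto: the same line is an ordinary assignment to the outer n.
int outer_n_without_auto()
{
    int n = 1;
    {
        n = 0; // overwrites the outer n
    }
    return n;
}

// Hypothetical helper that bundles the two observations.
void check_shadowing()
{
    assert(outer_n_with_auto() == 1);
    assert(outer_n_without_auto() == 0);
}
```

Calling check_shadowing() from main runs both assertions.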


Your question allows two interpretations:

  • Why do we need 'auto' at all? Can't we simply drop it?
  • Why are we obliged to use auto? Can't we just have it implicit, if it is not given?

Bathsheba answered the first interpretation nicely. For the second, consider the following (assuming no other declarations exist so far; hypothetically valid C++):

int f();
double g();

n = f(); // declares a new variable, type is int;
d = g(); // another new variable, type is double

if(n == d)
{
    n = 7; // reassigns n
    auto d = 2.0; // new d, shadowing the outer one
}

It would be possible; other languages get away with it quite well (apart from the shadowing issue, perhaps)... It is not so in C++, though, and the question (in the sense of the second interpretation) now is: Why?

This time, the answer is not as evident as in the first interpretation. One thing is obvious, though: The explicit requirement for the keyword makes the language safer (I do not know if this is what drove the language committee to its decision, still it remains a point):

grummel = f();

// ...

if(true)
{
    brummel = f();
  //^ uh, oh, a typo...
}

Can we agree on this not needing any further explanations?

The even bigger danger in not requiring auto, [however], is that it means that adding a global variable in a place far away from a function (e.g. in a header file) can turn what was intended to be the declaration of a locally-scoped variable in that function into an assignment to the global variable... with potentially disastrous (and certainly very confusing) consequences.

(Quoted from psmears' comment because of its importance; thanks for the hint.)


was it not possible to achieve the same outcome without explicitly declaring a variable auto?

I am going to rephrase your question slightly in a way that will help you understand why you need auto:

Was it not possible to achieve the same outcome without explicitly using a type placeholder?

Was it not possible? Of course it was "possible". The question is whether it would be worth the effort to do it.

Most syntaxes in other languages that do not require type names work in one of two ways. There's the Go-like way, where name := value; declares a variable. And there's the Python-like way, where name = value; declares a new variable if name has not previously been declared.

Let's assume that there are no syntactic issues with applying either syntax to C++ (even though I can already see that identifier followed by : in C++ means "make a label"). So, what do you lose compared to placeholders?

Well, I can no longer do this:

auto &name = get<0>(some_tuple);

See, auto on its own always means "value". If you want to get a reference, you need to explicitly use a &. And it will rightly fail to compile if the initializer expression is a prvalue. Neither of the assignment-based syntaxes has a way to differentiate between references and values.
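As a small sketch of that point (make_value and check_reference_binding are hypothetical names, not from the question), auto & binds to the tuple element itself, and the commented-out line is the prvalue case that is rejected:

```cpp
#include <cassert>
#include <tuple>

int make_value() { return 7; } // hypothetical helper; calling it yields a prvalue

void check_reference_binding()
{
    std::tuple<int, double> some_tuple{1, 2.5};

    auto &name = std::get<0>(some_tuple); // name aliases the tuple element
    name = 10;                            // writes through to the tuple
    assert(std::get<0>(some_tuple) == 10);

    // auto &bad = make_value(); // would not compile: a non-const lvalue
    //                           // reference cannot bind to a prvalue

    const auto &ok = make_value(); // a const reference may bind; the temporary's lifetime is extended
    assert(ok == 7);
}
```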

Now, you could make such assignment syntaxes deduce references if the given value is a reference. But that would mean that you can't do:

auto name = get<0>(some_tuple);

This copies from the tuple, creating an object independent of some_tuple. Sometimes, that's exactly what you want. This is even more useful if you want to move from the tuple with auto name = get<0>(std::move(some_tuple));.
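A short sketch of both uses, assuming a tuple whose first element is a std::string (the tuple contents and the function name are made up for illustration):

```cpp
#include <cassert>
#include <string>
#include <tuple>
#include <utility>

void check_copy_and_move()
{
    std::tuple<std::string, int> some_tuple{"hello", 1};

    auto name = std::get<0>(some_tuple); // copies the string out of the tuple
    name += " world";                    // only the copy changes
    assert(name == "hello world");
    assert(std::get<0>(some_tuple) == "hello");

    auto taken = std::get<0>(std::move(some_tuple)); // move-constructs from the element
    assert(taken == "hello");
    // The string inside some_tuple is now in a valid but unspecified state,
    // so nothing is asserted about it here.
}
```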

OK, so maybe we could extend these syntaxes a bit to account for this distinction. Maybe &name := value; or &name = value; would mean to deduce a reference like auto&.

OK, fine. What about this:

decltype(auto) name = some_thing();

Oh, that's right; C++ actually has two placeholders: auto and decltype(auto). The basic idea of this deduction is that it works exactly as if you had written decltype(expr) name = expr;. So in our case, if some_thing() returns an object by value, it will deduce an object type. If some_thing() returns a reference, it will deduce a reference type.

This is very useful when you're working in template code and are not sure exactly what the return value of a function will be. This is great for forwarding, and it is an essential tool, even if it is not widely used.
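Here is a minimal sketch of the difference, with two hypothetical functions standing in for some_thing(): one returns by value, the other returns a reference. The static_asserts record what each placeholder deduces:

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical stand-ins for some_thing().
int by_value(int &x) { return x; }      // returns an object
int &by_reference(int &x) { return x; } // returns a reference

void check_decltype_auto()
{
    int counter = 0;

    decltype(auto) a = by_value(counter);     // deduces int
    decltype(auto) b = by_reference(counter); // deduces int&
    auto c = by_reference(counter);           // plain auto drops the reference: int

    static_assert(std::is_same<decltype(a), int>::value, "an object stays an object");
    static_assert(std::is_same<decltype(b), int &>::value, "a reference stays a reference");
    static_assert(std::is_same<decltype(c), int>::value, "auto deduces a value");

    b = 5; // writes through to counter
    assert(counter == 5);
    assert(a == 0 && c == 0); // the two copies are unaffected
}
```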

So now we need to add more to our syntax. name ::= value; means "do what decltype(auto) does". I don't have an equivalent for the Pythonic variant.

Looking at this syntax, isn't that rather easy to accidentally mis-type? Not only that, it's hardly self-documenting. Even if you've never seen decltype(auto) before, it's big and obvious enough that you can at least easily tell that there's something special going on. Whereas the visual difference between ::= and := is minimal.

But that's opinion stuff; there are more substantive issues. See, all of this is based on using assignment syntax. Well... what about places where you can't use assignment syntax? Like this:

for(auto &x : container)

Do we change that to for(&x := container)? Because that seems to be saying something very different from range-based for. It looks like it's the initializer statement from a regular for loop, not a range-based for. It would also be a different syntax from non-deduced cases.
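For comparison, this is what the existing placeholder syntax already expresses inside a range-based for (a small sketch; the container contents and the function name are arbitrary):

```cpp
#include <cassert>
#include <vector>

void check_range_for()
{
    std::vector<int> container{1, 2, 3};

    for (auto x : container) // x is a copy of each element
    {
        x *= 2; // changes only the copy
    }
    assert(container == (std::vector<int>{1, 2, 3}));

    for (auto &x : container) // x refers to the element itself
    {
        x *= 2; // changes the container
    }
    assert(container == (std::vector<int>{2, 4, 6}));
}
```

The same auto / auto & distinction used in ordinary declarations carries over unchanged, which is the consistency an assignment-style syntax would have to recreate.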

Also, copy-initialization (using =) is not the same thing in C++ as direct-initialization (using constructor syntax). So name := value; may not work in cases where auto name(value) would have.
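One case where the two forms differ even with a deduced type is a class whose copy constructor is explicit. The Strict type below is made up purely to show this; it is a sketch, not something you would normally write:

```cpp
#include <cassert>

// Hypothetical type with an explicit copy constructor.
struct Strict
{
    Strict() = default;
    explicit Strict(const Strict &) = default;
    int value = 1;
};

void check_initialization_forms()
{
    Strict s;

    auto a(s); // direct-initialization: explicit constructors are considered
    assert(a.value == 1);

    // auto b = s; // copy-initialization: would not compile, because
    //             // explicit constructors are not considered here
}
```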

Sure, you could declare that := will use direct-initialization, but that would be quite incongruent with the way the rest of C++ behaves.

Also, there's one more thing: C++14. It gave us one useful deduction feature: return type deduction. But this is based on placeholders. So much like range-based for, it is fundamentally based on a type name that gets filled in by the compiler, not on some syntax applied to a particular name and expression.
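As a sketch of what that looks like (first_copy and first_ref are hypothetical names), both placeholders can sit in the return-type position, and they deduce different things from the same return statement:

```cpp
#include <cassert>
#include <type_traits>
#include <vector>

// auto return type: deduced as a value, here int.
auto first_copy(std::vector<int> &v) { return v[0]; }

// decltype(auto) return type: keeps the reference, here int&.
decltype(auto) first_ref(std::vector<int> &v) { return v[0]; }

void check_return_deduction()
{
    std::vector<int> v{1, 2, 3};

    static_assert(std::is_same<decltype(first_copy(v)), int>::value, "auto returns by value");
    static_assert(std::is_same<decltype(first_ref(v)), int &>::value, "decltype(auto) keeps the reference");

    first_ref(v) = 9; // assigns through the returned reference
    assert(v[0] == 9);
    assert(first_copy(v) == 9);
}
```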

See, all of these problems come from the same source: you're inventing entirely new syntax for declaring variables. Placeholder-based declarations didn't have to invent new syntax. They're using the exact same syntax as before; they're just employing a new keyword that acts like a type, but has a special meaning. This is what allows it to work in range-based for and for return type deduction. It is what allows it to have multiple forms (auto vs. decltype(auto)). And so forth.

Placeholders work because they are the simplest solution to the problem, while simultaneously retaining all of the benefits and generality of using an actual type name. If you came up with another alternative that worked as universally as placeholders do, it is highly unlikely that it would be as simple as placeholders.

Unless it was just spelling placeholders with different keywords or symbols...