How do static variables in lambda function objects work?

Are static variables used in a lambda retained across calls of the function wherein the lambda is used? Or is the function object "created" again each function call?

Useless Example:

#include <iostream>
#include <vector>
#include <algorithm>

using std::cout;

void some_function()
{
    std::vector<int> v = {0,1,2,3,4,5};
    std::for_each( v.begin(), v.end(),
         [](const int &i)
         {
             static int calls_to_cout = 0;
             cout << "cout has been called " << calls_to_cout << " times.\n"
                  << "\tCurrent int: " << i << "\n";
             ++calls_to_cout;
         } );
}

int main()
{
    some_function();
    some_function();
}

What is the correct output for this program? Does it depend on whether the lambda captures local variables? (Capturing will certainly change the underlying implementation of the function object, so it might have an influence.) Is it an allowed behavioural inconsistency?

I'm not looking for: "My compiler outputs ...", this is too new a feature to trust current implementations IMHO. I know asking for Standard quotes seems to be popular since the world discovered such a thing exists, but still, I would like a decent source.


Solution 1:

tl;dr version at the bottom.


§5.1.2 [expr.prim.lambda]

p1 lambda-expression:
lambda-introducer lambda-declarator_opt compound-statement

p3 The type of the lambda-expression (which is also the type of the closure object) is a unique, unnamed nonunion class type — called the closure type — whose properties are described below. This class type is not an aggregate (8.5.1). The closure type is declared in the smallest block scope, class scope, or namespace scope that contains the corresponding lambda-expression. (My note: Functions have a block scope.)

p5 The closure type for a lambda-expression has a public inline function call operator [...]

p7 The lambda-expression’s compound-statement yields the function-body (8.4) of the function call operator [...]

Since the compound-statement is directly taken as the function call operator's body, and the closure type is defined in the smallest (innermost) scope, it's the same as writing the following:

void some_function()
{
    struct /*unnamed unique*/{
      inline void operator()(int const& i) const{
        static int calls_to_cout = 0;
        cout << "cout has been called " << calls_to_cout << " times.\n"
             << "\tCurrent int: " << i << "\n";
        ++calls_to_cout;

      }
    } lambda;
    std::vector<int> v = {0,1,2,3,4,5};
    std::for_each( v.begin(), v.end(), lambda);
}

This is legal C++: member functions, like any other functions, are allowed to have static local variables.

§3.7.1 [basic.stc.static]

p1 All variables which do not have dynamic storage duration, do not have thread storage duration, and are not local have static storage duration. The storage for these entities shall last for the duration of the program.

p3 The keyword static can be used to declare a local variable with static storage duration. [...]

§6.7 [stmt.dcl] p4
(This deals with initialization of variables with static storage duration in a block scope.)

[...] Otherwise such a variable is initialized the first time control passes through its declaration; [...]


To reiterate:

  • The type of a lambda expression is created in the innermost scope.
  • It is not created anew for each function call (that wouldn't make sense, since the enclosing function body is equivalent to my example above).
  • It obeys (nearly) all the rules of normal classes / structs (only a few details, such as the meaning of the this keyword, are different), since it is a non-union class type.

Now that we have assured that for every function call, the closure type is the same, we can safely say that the static local variable is also the same; it's initialized the first time the function call operator is invoked and lives until the end of the program.
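
As a rough check of that conclusion, here is a minimal sketch that returns the counter instead of printing it. The name count_visits is made up for this illustration and is not part of the question's code. It also captures a local by reference, which touches on the question of whether captures change anything: under the rules quoted above they should not, because the static still belongs to the single function call operator of the single closure type.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical helper: returns the value of the lambda's static counter
// after this call has visited every element of v.
int count_visits(const std::vector<int>& v)
{
    int seen = 0;
    std::for_each(v.begin(), v.end(), [&seen](int) {
        static int calls = 0; // one instance for this closure type, for the whole program
        ++calls;
        seen = calls;         // report the current count to the enclosing function
    });
    return seen;
}
```

With the six-element vector from the question, the first call returns 6 and the second returns 12, since the counter is not reset when some enclosing function is entered again.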

Solution 2:

The static variable should behave just like it would in a function body. However, there's little reason to use one, since a lambda object can have member variables.

In the following, calls_to_cout is captured by value, which gives the lambda a member variable with the same name, initialized to the current value of calls_to_cout. This member variable retains its value across calls but is local to the lambda object, so any copies of the lambda will get their own calls_to_cout member variable instead of all sharing one static variable. This is much safer and better.

(Since a lambda's function call operator is const by default and this lambda modifies calls_to_cout, the lambda must be declared mutable.)

void some_function()
{
    std::vector<int> v = {0,1,2,3,4,5};
    int calls_to_cout = 0;
    std::for_each(v.begin(), v.end(),[calls_to_cout](const int &i) mutable
    {
        cout << "cout has been called " << calls_to_cout << " times.\n"
          << "\tCurrent int: " << i << "\n";
        ++calls_to_cout;
    });
}
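
To make the point about copies concrete, here is a small sketch (independent_copies is a name invented for this example). It copies a counting lambda partway through and shows that the two objects then count separately, which is exactly what a static local would not do.

```cpp
#include <utility>

// Hypothetical helper: returns {next count from the original, next count from the copy}.
std::pair<int, int> independent_copies()
{
    int calls = 0;
    auto counter = [calls]() mutable { return ++calls; };

    counter();           // counter's member is now 1
    counter();           // counter's member is now 2
    auto copy = counter; // the copy starts with its own member, also 2
    copy();              // copy's member is now 3; counter's is still 2

    return {counter(), copy()}; // {3, 4}
}
```

The same thing happens implicitly whenever a lambda is passed by value, which is how the standard algorithms take their function objects.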

If you do want a single variable to be shared between instances of the lambda you're still better off using captures. Just capture some kind of reference to the variable. For example here's a function that returns a pair of functions which share a reference to a single variable, and each function performs its own operation on that shared variable when called.

#include <functional>
#include <iostream>
#include <memory>
#include <tuple>

std::tuple<std::function<int()>,std::function<void()>>
make_incr_reset_pair() {
    std::shared_ptr<int> i = std::make_shared<int>(0);
    return std::make_tuple(
      [=]() { return ++*i; },
      [=]() { *i = 0; });
}

int main() {
    std::function<int()> increment;
    std::function<void()> reset;
    std::tie(increment,reset) = make_incr_reset_pair();

    std::cout << increment() << '\n';
    std::cout << increment() << '\n';
    std::cout << increment() << '\n';
    reset();
    std::cout << increment() << '\n';
}

Solution 3:

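A variable that keeps its value between calls can also be constructed in the capture (a C++14 init-capture; strictly speaking it is a member of the lambda object rather than a true static):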

auto v = vector<int>(99);
generate(v.begin(), v.end(), [x = int(1)] () mutable { return x++; });

The lambda can also be made by another lambda:

auto inc = [y=int(1)] () mutable { 
    ++y; // has to be separate, it doesn't like ++y inside the []
    return [y, x = int(1)] () mutable { return y+x++; }; 
};
generate(v.begin(), v.end(), inc());

Here, y can also be captured by reference, as long as inc outlives the lambda it returns.
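
For what it's worth, here is a sketch of what the factory version produces when it is used twice. The function two_runs is a hypothetical wrapper added for illustration, and the code assumes a C++14 compiler because of the init-captures.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical wrapper: fills two vectors from generators made by the same factory.
std::pair<std::vector<int>, std::vector<int>> two_runs(std::size_t n)
{
    auto inc = [y = 1]() mutable {
        ++y; // the factory's own state advances on every call
        return [y, x = 1]() mutable { return y + x++; };
    };

    std::vector<int> a(n), b(n);
    std::generate(a.begin(), a.end(), inc()); // this generator holds y == 2
    std::generate(b.begin(), b.end(), inc()); // this generator holds y == 3
    return {a, b};
}
```

For n == 3 the first vector comes out as 3, 4, 5 and the second as 4, 5, 6: x starts over in each generated lambda, while y carries on inside inc.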

Solution 4:

I do not have a copy of the final standard, and the draft does not appear to address the issue explicitly (see section 5.1.2, starting on page 87 of the PDF). But it does say that a lambda expression evaluates to a single object of closure type, which may be invoked repeatedly. That being so, I believe the standard requires that static variables be initialized once and only once, just as though you'd written out the class, operator(), and variable capture by hand.

But as you say, this is a new feature; at least for now you're stuck with whatever your implementation does, no matter what the standard says. It's better style to explicitly capture a variable in the enclosing scope anyway.
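
A minimal sketch of that capture-based style, assuming all you need is a count that the enclosing function can read afterwards (count_calls is a made-up name):

```cpp
#include <algorithm>
#include <vector>

// Hypothetical helper: counts lambda invocations in a local of the enclosing
// function, captured by reference, instead of in a static.
int count_calls(const std::vector<int>& v)
{
    int calls = 0; // fresh for every call of count_calls
    std::for_each(v.begin(), v.end(), [&calls](int) { ++calls; });
    return calls;
}
```

Unlike the static version, the count starts from zero on every call of the enclosing function, so a six-element vector gives 6 each time.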

Solution 5:

There are two ways to keep state in a lambda.

  1. Defining the variable as static in the lambda: the variable is persistent over lambda calls and lambda instantiations.
  2. Defining the variable in the lambda capture and marking the lambda as mutable: the variable is persistent over lambda calls, but it is reset at every lambda instantiation.

The following code illustrates the difference:

void foo() {
   auto f = [k=int(1)]() mutable { cout << k++ << "\n";}; // define k in the capture
   f();
   f();
}

void bar() {
   auto f = []() { static int k = 1; cout << k++ << "\n";}; // define k as static
   f();
   f();
}

void test() {
   foo();
   foo();  // k is reset every time the lambda is created: prints 1 2 again
   bar();
   bar();  // k is persistent through lambda instantiations: prints 3 4
}
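
One caveat worth sketching: the static is shared between instantiations of the same lambda expression only. Every lambda expression has its own unique closure type, so two lambdas that are textually identical still get separate statics. The function name separate_statics below is invented for this example.

```cpp
#include <utility>

// Hypothetical helper: two identical-looking lambdas, each with its own static k.
std::pair<int, int> separate_statics()
{
    auto a = []() { static int k = 0; return ++k; };
    auto b = []() { static int k = 0; return ++k; };

    a(); // advances only a's k
    a();
    return {a(), b()};
}
```

The first call returns {3, 1} and the second {6, 2}: each k persists across calls of separate_statics, but a and b never touch each other's counter.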