How can I capture the value of an outer variable inside a lambda expression?

I just encountered the following behavior:

for (var i = 0; i < 50; ++i) {
    Task.Factory.StartNew(() => {
        Debug.Print("Error: " + i.ToString());
    });
}

Will result in a series of "Error: x", where most of the x are equal to 50.

Similarly:

var a = "Before";
var task = new Task(() => Debug.Print("Using value: " + a));
a = "After";
task.Start();

Will result in "Using value: After".

This clearly means that the concatenation in the lambda expression does not occur immediately. How is it possible to use a copy of the outer variable in the lambda expression, at the time the expression is declared? The following will not work better (which is not necessarily incoherent, I admit):

var a = "Before";
var task = new Task(() => {
    var a2 = a;
    Debug.Print("Using value: " + a2);
});
a = "After";
task.Start();

This has more to do with lambdas than threading. A lambda captures the variable itself, not the value it held when the lambda was created. This means that when the lambda finally reads i, it sees whatever was stored in i most recently.

To avoid this, you need a copy of the variable's value. Copying it to a local variable when the lambda starts is too late: starting a task has overhead, and the first copy may be executed only after the loop finishes. The following code will also fail:

for (var i = 0; i < 50; ++i) {
    Task.Factory.StartNew(() => {
        var i1=i;
        Debug.Print("Error: " + i1.ToString());
    });
}

As James Manning noted, you can add a variable local to the loop body and copy the loop variable there. This creates 50 different variables, one per iteration, each holding its own copy of the loop variable's value, so you get the expected result. The drawback is that you get a lot of additional allocations.

for (var i = 0; i < 50; ++i) {
    var i1=i;
    Task.Factory.StartNew(() => {
        Debug.Print("Error: " + i1.ToString());
    });
}

The best solution is to pass the loop variable as a state parameter:

for (var i = 0; i < 50; ++i) {
    Task.Factory.StartNew(o => {
        var i1=(int)o;
        Debug.Print("Error: " + i1.ToString());
    }, i);
}

Using a state parameter results in fewer allocations. Looking at the decompiled code:

the second snippet (the loop-local copy i1) will create 50 closures and 50 delegates
the third snippet (the state parameter) will create 50 boxed ints but only a single delegate

That's because you are running the code in a new thread, and the main thread immediately goes on to change the variable. If the lambda expression were executed immediately, the entire point of using a task would be lost.

The task doesn't get its own copy of the variable at the time it is created; all the tasks use the same variable (which is actually stored in the closure for the method, so it is not an ordinary local variable).
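
To make the "stored in the closure" point concrete, here is a hand-written approximation of what the compiler roughly does with the loop from the question. The names LoopClosure, Message and ClosureSketch are made up for illustration; the real generated class has a compiler-chosen name, and the loop is shortened to 3 iterations and runs without tasks so the result is deterministic.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the compiler-generated closure class.
sealed class LoopClosure
{
    public int i;                                  // the captured variable lives here as a field
    public string Message() => "Error: " + i;      // the lambda body becomes a method
}

static class ClosureSketch
{
    public static List<Func<string>> BuildShared()
    {
        var funcs = new List<Func<string>>();
        var closure = new LoopClosure();           // one instance for the whole loop
        for (closure.i = 0; closure.i < 3; ++closure.i)
            funcs.Add(closure.Message);            // every delegate targets the same object
        return funcs;
    }

    static void Main()
    {
        // Each call reads closure.i after the loop has finished,
        // so this prints "Error: 3" three times.
        foreach (var f in BuildShared())
            Console.WriteLine(f());
    }
}
```

Because there is only one LoopClosure instance, there is only one i, no matter how many delegates were created from it.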

Lambda expressions do not capture the value of the outer variable but a reference to it. That is why you see 50 or After in your tasks.

To solve this, create a copy of the variable before your lambda expression and capture the copy instead.

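This unfortunate behaviour was changed in C# 5 (the compiler that shipped alongside .NET 4.5), but only for foreach loops, where each iteration now gets a fresh loop variable. A for loop still shares a single variable across all iterations, so there you still need the copy.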

Example:

    List<Action> acc = new List<Action>();
    for (int i = 0; i < 10; i++)
    {
        int tmp = i;
        acc.Add(() => { Console.WriteLine(tmp); });
    }

    acc.ForEach(x => x());
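
The same list-of-delegates approach can be used to compare a for loop with a foreach loop. As far as I know, the language change in C# 5 applies to foreach only, so the sketch below assumes a C# 5 or later compiler; the class and method names (LoopCaptureDemo, CaptureInFor, CaptureInForeach) are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class LoopCaptureDemo
{
    // for: a single variable i exists for the entire loop, so all lambdas share it.
    public static List<int> CaptureInFor()
    {
        var funcs = new List<Func<int>>();
        for (int i = 0; i < 3; i++)
            funcs.Add(() => i);
        return funcs.Select(f => f()).ToList();   // invoked after the loop has ended
    }

    // foreach (C# 5 and later): each iteration gets a fresh variable n.
    public static List<int> CaptureInForeach()
    {
        var funcs = new List<Func<int>>();
        foreach (int n in Enumerable.Range(0, 3))
            funcs.Add(() => n);
        return funcs.Select(f => f()).ToList();
    }

    static void Main()
    {
        Console.WriteLine(string.Join(",", CaptureInFor()));      // 3,3,3
        Console.WriteLine(string.Join(",", CaptureInForeach()));  // 0,1,2
    }
}
```

In the for version the explicit copy (int tmp = i;) is still required to get 0, 1, 2.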

Lambda expressions are by definition lazily evaluated, so they will not be evaluated until actually called, in your case when the task executes. If you close over a local in your lambda expression, the state of the local at the time of execution will be reflected, which is what you see. You can take advantage of this. For example, your for loop really doesn't need a new lambda for every iteration. Assuming, for the sake of the example, that the described result was what you intended, you could write

var i = 0;
// One delegate, created once; it closes over i and reads it when executed.
Action action = () => Debug.Print("Error: " + i);
for (; i < 50; ++i) {
    Task.Factory.StartNew(action);
}

On the other hand, if you wanted it to actually print "Error: 0" through "Error: 49", you could change the above to

var i = 0;
// x is a parameter, so each call to action(i) captures its own copy of the current value.
Func<int, Action> action = (x) => { return () => Debug.Print("Error: " + x); };
for (; i < 50; ++i) {
    Task.Factory.StartNew(action(i));
}

The first closes over i and will use the state of i at the time the Action is executed, which is often going to be the state after the loop finishes. In the latter case i is evaluated eagerly because it's passed as an argument to a function. This function then returns an Action which is passed to StartNew.

So the design decision makes both lazy evaluation and eager evaluation possible: lazy because locals are closed over, and eager because you can force locals to be evaluated by passing them as an argument or, as shown below, by declaring another local with a shorter scope

for (var i = 0; i < 50; ++i) {
    var j = i;
    Task.Factory.StartNew(() => Debug.Print("Error: " + j));
}

All of the above applies to lambdas in general. In the specific case of StartNew, there is an overload that takes a state argument and does what the second example does, so that example can be simplified to

var i = 0;
// The lambda captures nothing; the current value of i is boxed and passed as state.
Action<object> action = (x) => Debug.Print("Error: " + x);
for (; i < 50; ++i) {
    Task.Factory.StartNew(action, i);
}
