C++11 lambda implementation and memory model
I would like some information on how to correctly think about C++11 closures and std::function in terms of how they are implemented and how memory is handled.
Although I don't believe in premature optimisation, I do have a habit of carefully considering the performance impact of my choices while writing new code. I also do a fair amount of real-time programming, e.g. on microcontrollers and for audio systems, where non-deterministic memory allocation/deallocation pauses are to be avoided.
Therefore I'd like to develop a better understanding of when to use or not use C++ lambdas.
My current understanding is that a lambda with no captured closure is exactly like a C callback. However, when the environment is captured either by value or by reference, an anonymous object is created on the stack. When a value-closure must be returned from a function, one wraps it in std::function. What happens to the closure memory in this case? Is it copied from the stack to the heap? Is it freed whenever the std::function is freed, i.e., is it reference-counted like a std::shared_ptr?
I imagine that in a real-time system I could set up a chain of lambda functions, passing B as a continuation argument to A, so that a processing pipeline A->B is created. In this case, the A and B closures would be allocated once, although I'm not sure whether they would be allocated on the stack or the heap. In general, though, this seems safe to use in a real-time system. On the other hand, if B constructs some lambda function C, which it returns, then the memory for C would be allocated and deallocated repeatedly, which would not be acceptable for real-time usage.
In pseudo-code, here is a DSP loop that I think is going to be real-time safe. I want to perform processing block A and then B, where A calls its argument. Both these functions return std::function objects, so f will be a std::function object whose environment is stored on the heap:
auto f = A(B); // A returns a function which calls B
// Memory for the function returned by A is on the heap?
// Note that A and B may maintain a state
// via mutable value-closure!
for (t=0; t<1000; t++) {
    y = f(t);
}
And one which I think might be bad to use in real-time code:
for (t=0; t<1000; t++) {
    y = A(B)(t);
}
And one where I think stack memory is likely used for the closure:
freq = 220;
A = 2;
for (t=0; t<1000; t++) {
    // construct the closure and call it immediately
    y = [=](int t){ return sin(t*freq)*A; }(t);
}
In the last case the closure is constructed at each iteration of the loop, but unlike the previous example it is cheap because it is just like a function call: no heap allocations are made. Moreover, I wonder if a compiler could "lift" the closure and make inlining optimisations.
Is this correct? Thank you.
Solution 1:
My current understanding is that a lambda with no captured closure is exactly like a C callback. However, when the environment is captured either by value or by reference, an anonymous object is created on the stack.
No; it is always a C++ object with an unknown type, created on the stack. A capture-less lambda can be converted into a function pointer (though whether it is suitable for C calling conventions is implementation dependent), but that doesn't mean it is a function pointer.
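To make the distinction concrete, here is a minimal sketch of a capture-less lambda being converted to a function pointer. The names apply_twice and demo_captureless are made up for illustration; apply_twice stands in for any API that takes a plain callback.

```cpp
#include <cassert>

// Hypothetical C-style API that accepts a plain function pointer.
int apply_twice(int (*callback)(int), int value) {
    return callback(callback(value));
}

int demo_captureless() {
    // The lambda is still an object of a unique, unnamed closure type...
    auto add_one = [](int x) { return x + 1; };

    // ...but because it captures nothing, it converts implicitly
    // to an ordinary function pointer.
    int (*fp)(int) = add_one;
    assert(fp(1) == 2);

    // The same conversion happens when passing it as an argument.
    return apply_twice(add_one, 5);  // (5 + 1) + 1
}
```

If add_one captured anything, both the assignment to fp and the call to apply_twice would fail to compile, because the conversion only exists for lambdas without captures.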
When a value-closure must be returned from a function, one wraps it in std::function. What happens to the closure memory in this case?
A lambda isn't anything special in C++11. It's an object like any other object. A lambda expression results in a temporary, which can be used to initialize a variable on the stack:
auto lamb = []() {return 5;};
lamb is a stack object. It has a constructor and destructor. And it will follow all of the C++ rules for that. The type of lamb will contain the values/references that are captured; they will be members of that object, just like any other object members of any other type.
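As a rough illustration of captures being ordinary members, the sketch below puts a lambda next to a hand-written functor that behaves the same way. ScaleClosure and demo_members are invented names; the real closure type is unnamed and its exact layout is up to the compiler.

```cpp
#include <cassert>

// Roughly what the compiler generates for
//   [freq, &gain](int t) { return t * freq * gain; }
// (ScaleClosure is a made-up name; the real type has no name).
struct ScaleClosure {
    int freq;    // captured by value: a copy stored in the object
    int& gain;   // captured by reference: a reference member
    int operator()(int t) const { return t * freq * gain; }
};

int demo_members() {
    int freq = 220;
    int gain = 2;

    auto closure = [freq, &gain](int t) { return t * freq * gain; };
    ScaleClosure by_hand{freq, gain};

    gain = 3;  // visible through the reference capture
    freq = 0;  // not visible: both objects hold their own copy of freq

    assert(closure(1) == by_hand(1));
    return closure(1);  // 1 * 220 * 3
}
```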
You can give it to a std::function:
auto func_lamb = std::function<int()>(lamb);
In this case, it will get a copy of the value of lamb. If lamb had captured anything by value, there would be two copies of those values; one in lamb, and one in func_lamb.
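A small sketch of the "two copies" point, using a mutable counter so the separate state is observable. The function name demo_copies and the counter are assumptions added for illustration.

```cpp
#include <cassert>
#include <functional>

int demo_copies() {
    int count = 0;
    // mutable lets operator() modify the closure's own copy of count.
    auto counter = [count]() mutable { return ++count; };

    std::function<int()> func_counter = counter;  // copies the closure

    counter();                        // counter's count: 1
    counter();                        // counter's count: 2
    int from_lambda = counter();      // counter's count: 3
    int from_func = func_counter();   // func_counter's own count: 1

    assert(from_lambda == 3);
    assert(from_func == 1);
    return from_lambda * 10 + from_func;
}
```

Calls made through one object never affect the captured state held by the other.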
When the current scope ends, func_lamb will be destroyed, followed by lamb, as per the rules of cleaning up stack variables.
You could just as easily allocate one on the heap:
auto func_lamb_ptr = new std::function<int()>(lamb);
Exactly where the memory for the contents of a std::function goes is implementation-dependent, but the type-erasure employed by std::function generally requires at least one memory allocation (implementations may store small callables inline and skip it). This is why std::function's constructor can take an allocator in C++11; those overloads were removed in C++17.
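If copying the closure into std::function is the concern, one option is to hand it a std::reference_wrapper instead. The sketch below assumes the closure outlives the std::function; demo_ref is a made-up name. The standard says the constructor does not throw for a reference_wrapper or function pointer target, and implementations are expected to store such small targets without allocating, though it is worth checking on your own toolchain.

```cpp
#include <cassert>
#include <functional>

int demo_ref() {
    int count = 0;
    auto counter = [count]() mutable { return ++count; };

    // std::ref wraps a reference to counter; the std::function stores
    // only that small wrapper, not a copy of the closure.
    // Assumption: counter stays alive for as long as by_ref is used.
    std::function<int()> by_ref = std::ref(counter);

    by_ref();          // increments the count inside counter itself
    by_ref();
    return counter();  // third increment of the same state
}
```

The trade-off is lifetime management: the std::function no longer owns anything, so it must not be returned from the scope that owns counter.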
Is it freed whenever the std::function is freed, i.e., is it reference-counted like a std::shared_ptr?
std::function stores a copy of its contents. Like virtually every standard library C++ type, function uses value semantics. Thus, it is copyable; when it is copied, the new function object is completely separate. It is also movable, so any internal allocations can be transferred appropriately without needing more allocating and copying.
Thus there is no need for reference counting.
Everything else you state is correct, assuming that "memory allocation" equates to "bad to use in real-time code".
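For the real-time case, a pipeline can also be built once without std::function at all, so the closures live wherever the pipeline object lives. This is only a sketch: Pipeline, make_pipeline and run_pipeline are hypothetical names, and it composes A then B directly rather than passing B to A as a continuation.

```cpp
#include <cassert>
#include <utility>

// Hypothetical helper: calls `first`, then feeds its result to `second`.
// Both closures are stored by value as plain members.
template <class F, class G>
struct Pipeline {
    F first;
    G second;
    double operator()(double x) { return second(first(x)); }
};

template <class F, class G>
Pipeline<F, G> make_pipeline(F first, G second) {
    return Pipeline<F, G>{std::move(first), std::move(second)};
}

double run_pipeline() {
    double gain = 0.5;
    double state = 0.0;
    auto A = [gain](double x) { return x * gain; };                     // stateless stage
    auto B = [state](double x) mutable { state += x; return state; };  // running sum

    auto f = make_pipeline(A, B);  // built once, outside the loop

    double y = 0.0;
    for (int t = 0; t < 4; ++t) {
        y = f(t);  // direct calls; no type erasure involved
    }
    return y;  // 0.5 * (0 + 1 + 2 + 3)
}
```

Because the concrete closure types are known to the compiler here, the calls can usually be inlined, at the cost of f having a type that cannot be named without auto or a template parameter.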
Solution 2:
A C++ lambda is just syntactic sugar around an (anonymous) functor class with an overloaded operator(), and std::function is just a wrapper around callables (i.e. functors, lambdas, C functions, ...) which copies by value the "solid lambda object" from the current stack scope, typically to the heap.
To test the number of actual constructor calls/relocations I made a test (using another level of wrapping in a shared_ptr, but that is not essential). See for yourself:
#include <functional>  // std::function
#include <memory>
#include <string>
#include <iostream>

class Functor {
    std::string greeting;
public:
    // Declaring a copy constructor suppresses the implicit move constructor,
    // so every transfer of a Functor shows up as "Copy-Ctor".
    Functor(const Functor &rhs) {
        this->greeting = rhs.greeting;
        std::cout << "Copy-Ctor \n";
    }
    Functor(std::string _greeting="Hello!"): greeting { _greeting } {
        std::cout << "Ctor \n";
    }
    Functor & operator=(const Functor & rhs) {
        greeting = rhs.greeting;
        std::cout << "Copy-assigned\n";
        return *this;
    }
    virtual ~Functor() {
        std::cout << "Dtor\n";
    }
    void operator()()
    {
        std::cout << "hey" << "\n";
    }
};

// Explicit return type so this also compiles as C++11
// (a plain `auto` return type needs C++14).
std::shared_ptr<std::function<void()>> getFpp() {
    auto fp = std::make_shared<std::function<void()>>(Functor{});
    (*fp)();
    return fp;
}

int main() {
    auto f = getFpp();
    (*f)();
}
It produces this output:
Ctor
Copy-Ctor
Copy-Ctor
Dtor
Dtor
hey
hey
Dtor
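Exactly the same set of ctors/dtors would be called for a stack-allocated lambda object! The Ctor is the temporary on the stack, and the two Copy-ctors happen on the way into std::function: with the standard library used here, apparently one into the constructor's by-value parameter and one (+ heap alloc) into the storage owned by std::function. make_shared adds a further heap allocation for the std::function itself, but constructs it in place rather than copying the Functor again.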
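To check the lambda case directly, here is a sketch that counts copies of a captured object while a closure is handed to std::function. Tracer and count_copies_into_function are invented for this example, and the exact count depends on the standard library version, so the code deliberately does not claim one.

```cpp
#include <cassert>
#include <functional>

// Hypothetical helper that counts how often it is copied.
// Like Functor above, it has a copy constructor and no move constructor.
struct Tracer {
    static int copies;
    Tracer() {}
    Tracer(const Tracer&) { ++copies; }
};
int Tracer::copies = 0;

int count_copies_into_function() {
    Tracer tracer;
    auto closure = [tracer]() { (void)tracer; };  // one copy: into the closure
    Tracer::copies = 0;                           // count only what follows

    // Copying the closure copies its captured members with it.
    std::function<void()> f = closure;
    f();

    assert(Tracer::copies >= 1);
    return Tracer::copies;  // at least 1; the exact value is library-dependent
}
```

Printing the returned count on your own toolchain shows how many times the capture was copied on its way into the wrapper.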