Looking for C++ STL-like vector class but using stack storage

Before I write my own I will ask all y'all.

I'm looking for a C++ class that is almost exactly like an STL vector but stores its data in an array on the stack. Some kind of STL allocator class would also work, but I am trying to avoid any kind of heap, even statically allocated per-thread heaps (although one of those is my second choice). The stack is just more efficient.

It needs to be almost a drop in replacement for current code that uses a vector.

For what I was about to write myself I was thinking of something like this:

char buffer[4096];
stack_vector<match_item> matches(buffer, sizeof(buffer));

Or the class could have buffer space allocated internally. Then it would look like:

stack_vector<match_item, 256> matches;

I was thinking it would throw std::bad_alloc if it runs out of space, although that should not ever happen.

Update

Using Chromium's stack_container.h works great!

The reason I hadn't thought of doing it this way myself is that I have always overlooked the allocator object parameter to the STL collection constructors. I have used the template parameter a few times to do static pools but I'd never seen code or written any that actually used the object parameter. I learned something new. Very cool!

The code is a bit messy, and for some reason GCC forced me to declare the allocator as a named object instead of constructing it inline as the vector's constructor argument. It went from something like this:

typedef std::pair< const char *, const char * > comp_list_item;
typedef std::vector< comp_list_item > comp_list_type;

comp_list_type match_list;
match_list.reserve(32);

To this:

static const size_t comp_list_alloc_size = 128;
typedef std::pair< const char *, const char * > comp_list_item;
typedef StackAllocator< comp_list_item, comp_list_alloc_size > comp_list_alloc_type;
typedef std::vector< comp_list_item, comp_list_alloc_type > comp_list_type;

comp_list_alloc_type::Source match_list_buffer;
comp_list_alloc_type match_list_alloc( &match_list_buffer );
comp_list_type match_list( match_list_alloc );
match_list.reserve( comp_list_alloc_size );

And I have to repeat that whenever I declare a new one. But it works just like I wanted.

I noticed that stack_container.h has a StackVector defined and I tried using it. But it doesn't inherit from vector or define the same methods so it wasn't a drop-in replacement. I didn't want to rewrite all the code using the vector so I gave up on it.


Solution 1:

You don't have to write a completely new container class. You can stick with your STL containers but change the second template parameter of, for example, std::vector to a custom allocator that allocates from a stack buffer. The Chromium authors wrote an allocator just for this:

https://chromium.googlesource.com/chromium/chromium/+/master/base/stack_container.h

It works by allocating a buffer whose size you specify. You create the container and call container.reserve(buffer_size);. If you overflow that size, the allocator automatically gets elements from the heap (since it is derived from std::allocator, it just falls back on the standard allocator's facilities in that case). I haven't tried it, but it comes from Google, so I think it's worth a try.

Usage is like this:

StackVector<int, 128> s;
s->push_back(42); // overloaded operator->
s->push_back(43);

// to get the real std::vector. 
StackVector<int, 128>::ContainerType & v = s.container();
std::cout << v[0] << " " << v[1] << std::endl;
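The mechanism described above (hand out an inline buffer once, fall back to the heap for anything else) can be sketched in a few dozen lines. This is a simplified illustration and not Chromium's actual code: InlineSource, InlineAllocator, and demo_stays_inline are hypothetical names, and the sketch falls back to ::operator new directly instead of deriving from std::allocator.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <vector>

// Hypothetical names throughout; a sketch of the idea, not stack_container.h.
template <typename T, std::size_t N>
struct InlineSource {
    alignas(T) unsigned char bytes[sizeof(T) * N];  // raw storage for N objects
    bool in_use = false;                            // buffer already handed out?
    T* data() noexcept { return reinterpret_cast<T*>(bytes); }
};

template <typename T, std::size_t N>
class InlineAllocator {
public:
    using value_type = T;
    using Source = InlineSource<T, N>;

    // Needed explicitly because N is a non-type template parameter.
    template <typename U>
    struct rebind { using other = InlineAllocator<U, N>; };

    explicit InlineAllocator(Source* source) noexcept : source_(source) {}

    // Allocators rebound to another type get no buffer and always use the heap.
    template <typename U>
    InlineAllocator(const InlineAllocator<U, N>&) noexcept : source_(nullptr) {}

    T* allocate(std::size_t n) {
        if (source_ && !source_->in_use && n <= N) {
            source_->in_use = true;       // the buffer serves one request at a time
            return source_->data();
        }
        return static_cast<T*>(::operator new(n * sizeof(T)));  // heap fallback
    }

    void deallocate(T* p, std::size_t) noexcept {
        if (source_ && p == source_->data()) {
            source_->in_use = false;      // buffer becomes available again
        } else {
            ::operator delete(p);
        }
    }

    const void* id() const noexcept { return source_; }

private:
    Source* source_;
};

// Two allocators are interchangeable only if they share the same source.
template <typename T, typename U, std::size_t N>
bool operator==(const InlineAllocator<T, N>& a, const InlineAllocator<U, N>& b) noexcept {
    return a.id() == b.id();
}
template <typename T, typename U, std::size_t N>
bool operator!=(const InlineAllocator<T, N>& a, const InlineAllocator<U, N>& b) noexcept {
    return !(a == b);
}

// Fills a vector up to the inline capacity and reports whether its storage
// is still the stack buffer.
inline bool demo_stays_inline() {
    InlineSource<int, 8> source;                 // lives on the stack
    InlineAllocator<int, 8> alloc(&source);
    std::vector<int, InlineAllocator<int, 8>> v(alloc);
    v.reserve(8);                                // claims the whole inline buffer
    for (int i = 0; i < 8; ++i) v.push_back(i);
    assert(v.size() == 8);
    return v.data() == source.data();
}
```

The source object must outlive every container that uses it, and the reserve() call matters: without it the vector's first small allocation would claim the buffer and the next growth step would already land on the heap.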

Solution 2:

It seems that boost::static_vector is what you are searching for. From the documentation:

static_vector is an hybrid between vector and array: like vector, it's a sequence container with contiguous storage that can change in size, along with the static allocation, low overhead, and fixed capacity of array. static_vector is based on Adam Wulkiewicz and Andrew Hundt's high-performance varray class.

The number of elements in a static_vector may vary dynamically up to a fixed capacity because elements are stored within the object itself similarly to an array.
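To make the "stored within the object itself" part concrete, here is a minimal sketch of the fixed-capacity idea that does not need Boost. FixedVector is a hypothetical name and this is not Boost's implementation; it only covers a handful of vector's methods, and it throws std::bad_alloc when full, as the question suggested.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical class for illustration; not boost::static_vector.
template <typename T, std::size_t N>
class FixedVector {
public:
    FixedVector() = default;
    FixedVector(const FixedVector&) = delete;             // kept non-copyable
    FixedVector& operator=(const FixedVector&) = delete;  // to keep the sketch short
    ~FixedVector() { clear(); }

    void push_back(const T& value) {
        if (size_ == N) throw std::bad_alloc();           // capacity is fixed
        ::new (static_cast<void*>(slot(size_))) T(value); // construct in place
        ++size_;
    }

    void pop_back() {
        assert(size_ > 0);
        --size_;
        slot(size_)->~T();                                // destroy, keep storage
    }

    void clear() { while (size_ > 0) pop_back(); }

    T& operator[](std::size_t i) { return *slot(i); }
    const T& operator[](std::size_t i) const { return *slot(i); }

    std::size_t size() const noexcept { return size_; }
    static constexpr std::size_t capacity() noexcept { return N; }

    T* begin() noexcept { return slot(0); }
    T* end() noexcept { return slot(size_); }

private:
    T* slot(std::size_t i) noexcept { return reinterpret_cast<T*>(storage_) + i; }
    const T* slot(std::size_t i) const noexcept {
        return reinterpret_cast<const T*>(storage_) + i;
    }

    alignas(T) unsigned char storage_[sizeof(T) * N];     // lives inside the object
    std::size_t size_ = 0;
};
```

Declared as a local variable, such an object puts all of its elements on the stack, so sizeof(FixedVector<T, N>) grows with N.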

Solution 3:

Some options you may want to look at:

STLSoft by Matthew Wilson (author of Imperfect C++) has an auto_buffer template class that puts a default array on the stack, but if it grows larger than the stack allocation, it grabs the memory from the heap instead. I like this class: if you know that your container sizes are generally going to be bounded by a rather low limit, you get the speed of a local, stack-allocated array. However, for the corner cases where you need more memory, it all still works properly.

http://www.stlsoft.org/doc-1.9/classstlsoft_1_1auto__buffer.html

Note that the implementation I use myself is not STLSoft's, but an implementation that borrows heavily from it.

"The Lazy Programmer" posted an implementation of a container that uses alloca() for its storage. I'm not a fan of this technique, but I'll let you decide for yourself if it's what you want:

http://tlzprgmr.wordpress.com/2008/04/02/c-how-to-create-variable-length-arrays-on-the-stack/

Then there's boost::array, which has none of the dynamic sizing behavior of the first two but gives you more of the vector interface than the pointers-as-iterators you get with built-in arrays (i.e., you get begin(), end(), size(), etc.):

http://www.boost.org/doc/libs/1_37_0/doc/html/boost/array.html

Solution 4:

If speed matters, I see run times

  • 4 ns int[10], fixed size on the stack
  • 40 ns <vector>
  • 1300 ns <stlsoft/containers/pod_vector.hpp>

for the one crude test below (2 pushes, reads of v[0] and v[1], 2 pops per iteration) on a single platform: Mac PPC, gcc-4.2 -O3. (I have no idea whether Apple has optimized their STL.)

Don't accept any timings you haven't faked yourself. And of course every usage pattern is different. Nonetheless factors > 2 surprise me.

(If mems, memory accesses, are the dominant factor in runtimes, what are all the extra mems in the various implementations ?)

#include <stlsoft/containers/pod_vector.hpp>
#include <stdio.h>
#include <stdlib.h>  // atoi
#include <vector>
using namespace std;

int main( int argc, char* argv[] )
{
    // times for 2 push, v[0] v[1], 2 pop, mac g4 ppc gcc-4.2 -O3 --
    // Vecint10 v;  // stack int[10]: 4 ns
    vector<int> v;  // 40 ns
    // stlsoft::pod_vector<int> v;  // 1300 ns
    // stlsoft::pod_vector<int, std::allocator<int>, 64> v;

    // optional argument: iteration count in millions (default 10)
    int n = (argc > 1 ? atoi( argv[1] ) : 10) * 1000000;
    int sum = 0;

    while( --n >= 0 ){
        v.push_back( n );
        v.push_back( n );
        sum += v[0] + v[1];
        v.pop_back();
        v.pop_back();
    }
    printf( "sum: %d\n", sum );

}

Solution 5:

You can use your own allocator for std::vector and have it allocate chunks of your stack-based storage, similar to your example. The allocator class is the second template parameter.

Edit: I've never tried this, and looking at the documentation further leads me to believe you can't write your own allocator. I'm still looking into it.
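For what it's worth, the standard containers do accept custom allocators, and since C++17 the standard library ships a ready-made version of this idea in <memory_resource>. A minimal sketch, assuming a C++17 compiler; the function name fill_until_exhausted and the buffer size are made up for illustration:

```cpp
#include <cassert>
#include <cstddef>
#include <memory_resource>
#include <new>
#include <vector>

// Fills a pmr vector backed by a stack buffer until the buffer runs out,
// then returns how many elements fit. (Hypothetical helper for illustration.)
inline std::size_t fill_until_exhausted() {
    // Room for 64 ints, on the stack.
    alignas(std::max_align_t) std::byte buffer[64 * sizeof(int)];

    // null_memory_resource() as the upstream means "never touch the heap":
    // once the buffer is used up, allocation throws std::bad_alloc.
    std::pmr::monotonic_buffer_resource arena(
        buffer, sizeof(buffer), std::pmr::null_memory_resource());

    std::pmr::vector<int> v(&arena);
    v.reserve(32);                 // one allocation, half of the buffer
    assert(v.capacity() >= 32);

    try {
        for (;;) v.push_back(static_cast<int>(v.size()));
    } catch (const std::bad_alloc&) {
        // Growing past 32 elements needs a new, larger block. A monotonic
        // resource never reuses released memory, so the remaining half of
        // the buffer is too small and the request goes upstream, which throws.
    }
    return v.size();
}
```

Because the monotonic resource never recycles memory, reserving the final size up front is what keeps everything inside the buffer. Replacing null_memory_resource() with the default upstream would give heap fallback instead of an exception.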