What is the most efficient way to append one std::vector to the end of another?
Let v1 be the target vector, v2 needs to be appended to the back of it.
I'm now doing:
v1.reserve(v1.size() + v2.size());
copy(v2.begin(), v2.end(), back_inserter(v1));
Is this the most efficient way? Or can it maybe be done just via copying a chunk of memory? Thanks!
After a lot of arguing (and a reasonable comment from Matthieu M. and villintehaspam), I'll change my suggestion to
v1.insert( v1.end(), v2.begin(), v2.end() );
I'll keep the former suggestion here:
v1.reserve( v1.size() + v2.size() );
v1.insert( v1.end(), v2.begin(), v2.end() );
There are some reasons to do it the latter way, although none of them enough strong:
- there is no guarantee on to what size will the vector be reallocated -- e.g. if the sum size is 1025, it may get reallocated to 2048 -- dependant on implementation. There is no such guarantee for
reserve
either, but for a specific implementation it might be true. If hunting for a bottleneck it might be rasonable to check that. - reserve states our intentions clear -- optimization may be more efficient in this case (reserve could prepare the cache in some top-notch implementation).
- also, with
reserve
we have a C++ Standard guarantee that there will be only a single reallocation, whileinsert
might be implemented inefficiently and do several reallocations (also something to test with a particular implementation).
Probably better and simpler to use a dedicated method: vector.insert
v1.insert(v1.end(), v2.begin(), v2.end());
As Michael mentions, unless the iterators are input iterators, the vector will figure out the required size and copy appended data at one go with linear complexity.
I simply did a quick performance measurement with the following code and
v1.insert( v1.end(), v2.begin(), v2.end() );
seems to be the right choice (as already stated above). Nevertheless, you find the reported performance below.
Test code:
#include <vector>
#include <string>
#include <boost/timer/timer.hpp>
//==============================================================================
//
//==============================================================================
/// Returns a vector containing the sequence [ 0, ... , n-1 ].
inline std::vector<int> _range(const int n)
{
std::vector<int> tmp(n);
for(int i=0; i<n; i++)
tmp[i] = i;
return tmp;
}
void test_perf_vector_append()
{
const vector<int> testdata1 = _range(100000000);
const vector<int> testdata2 = _range(100000000);
vector<int> testdata;
printf("--------------------------------------------------------------\n");
printf(" METHOD: push_back()\n");
printf("--------------------------------------------------------------\n");
testdata.clear();
{ vector<int>().swap(testdata); }
testdata = testdata1;
{
boost::timer::auto_cpu_timer t;
for(size_t i=0; i<testdata2.size(); i++)
{
testdata.push_back(testdata2[i]);
}
}
printf("--------------------------------------------------------------\n");
printf(" METHOD: reserve() + push_back()\n");
printf("--------------------------------------------------------------\n");
testdata.clear();
{ vector<int>().swap(testdata); }
testdata = testdata1;
{
boost::timer::auto_cpu_timer t;
testdata.reserve(testdata.size() + testdata2.size());
for(size_t i=0; i<testdata2.size(); i++)
{
testdata.push_back(testdata2[i]);
}
}
printf("--------------------------------------------------------------\n");
printf(" METHOD: insert()\n");
printf("--------------------------------------------------------------\n");
testdata.clear();
{ vector<int>().swap(testdata); }
testdata = testdata1;
{
boost::timer::auto_cpu_timer t;
testdata.insert( testdata.end(), testdata2.begin(), testdata2.end() );
}
printf("--------------------------------------------------------------\n");
printf(" METHOD: reserve() + insert()\n");
printf("--------------------------------------------------------------\n");
testdata.clear();
{ vector<int>().swap(testdata); }
testdata = testdata1;
{
boost::timer::auto_cpu_timer t;
testdata.reserve( testdata.size() + testdata.size() );
testdata.insert( testdata.end(), testdata2.begin(), testdata2.end() );
}
printf("--------------------------------------------------------------\n");
printf(" METHOD: copy() + back_inserter()\n");
printf("--------------------------------------------------------------\n");
testdata.clear();
{ vector<int>().swap(testdata); }
testdata = testdata1;
{
boost::timer::auto_cpu_timer t;
testdata.reserve(testdata.size() + testdata2.size());
copy(testdata2.begin(), testdata2.end(), back_inserter(testdata));
}
printf("--------------------------------------------------------------\n");
printf(" METHOD: reserve() + copy() + back_inserter()\n");
printf("--------------------------------------------------------------\n");
testdata.clear();
{ vector<int>().swap(testdata); }
testdata = testdata1;
{
boost::timer::auto_cpu_timer t;
testdata.reserve(testdata.size() + testdata2.size());
copy(testdata2.begin(), testdata2.end(), back_inserter(testdata));
}
}
With Visual Studio 2008 SP1, x64, Release mode, /O2 /LTCG the output is as follows:
--------------------------------------------------------------
METHOD: push_back()
--------------------------------------------------------------
0.933077s wall, 0.577204s user + 0.343202s system = 0.920406s CPU (98.6%)
--------------------------------------------------------------
METHOD: reserve() + push_back()
--------------------------------------------------------------
0.612753s wall, 0.452403s user + 0.171601s system = 0.624004s CPU (101.8%)
--------------------------------------------------------------
METHOD: insert()
--------------------------------------------------------------
0.424065s wall, 0.280802s user + 0.140401s system = 0.421203s CPU (99.3%)
--------------------------------------------------------------
METHOD: reserve() + insert()
--------------------------------------------------------------
0.637081s wall, 0.421203s user + 0.218401s system = 0.639604s CPU (100.4%)
--------------------------------------------------------------
METHOD: copy() + back_inserter()
--------------------------------------------------------------
0.743658s wall, 0.639604s user + 0.109201s system = 0.748805s CPU (100.7%)
--------------------------------------------------------------
METHOD: reserve() + copy() + back_inserter()
--------------------------------------------------------------
0.748560s wall, 0.624004s user + 0.124801s system = 0.748805s CPU (100.0%)