Calculate mean and standard deviation from a vector of samples in C++ using Boost

Is there a way to calculate mean and standard deviation for a vector containing samples using Boost?

Or do I have to create an accumulator and feed the vector into it?


Solution 1:

I don't know if Boost has more specific functions, but you can do it with the standard library.

Given std::vector<double> v, this is the naive way:

#include <numeric>

double sum = std::accumulate(v.begin(), v.end(), 0.0);
double mean = sum / v.size();

double sq_sum = std::inner_product(v.begin(), v.end(), v.begin(), 0.0);
double stdev = std::sqrt(sq_sum / v.size() - mean * mean);

This is susceptible to overflow or underflow for huge or tiny values. A slightly better way to calculate the standard deviation is:

double sum = std::accumulate(v.begin(), v.end(), 0.0);
double mean = sum / v.size();

std::vector<double> diff(v.size());
std::transform(v.begin(), v.end(), diff.begin(),
               std::bind2nd(std::minus<double>(), mean));
double sq_sum = std::inner_product(diff.begin(), diff.end(), diff.begin(), 0.0);
double stdev = std::sqrt(sq_sum / v.size());

UPDATE for C++11:

The call to std::transform can be written using a lambda function instead of std::minus and std::bind2nd(now deprecated):

std::transform(v.begin(), v.end(), diff.begin(), [mean](double x) { return x - mean; });

Solution 2:

If performance is important to you, and your compiler supports lambdas, the stdev calculation can be made faster and simpler: In tests with VS 2012 I've found that the following code is over 10 X quicker than the Boost code given in the chosen answer; it's also 5 X quicker than the safer version of the answer using standard libraries given by musiphil.

Note I'm using sample standard deviation, so the below code gives slightly different results (Why there is a Minus One in Standard Deviations)

double sum = std::accumulate(std::begin(v), std::end(v), 0.0);
double m =  sum / v.size();

double accum = 0.0;
std::for_each (std::begin(v), std::end(v), [&](const double d) {
    accum += (d - m) * (d - m);
});

double stdev = sqrt(accum / (v.size()-1));