Incremental computation of standard deviation
How can I compute the standard deviation incrementally, i.e., using only the new value and the previously computed mean and/or standard deviation?
For the non-incremental way, I just do something like: $$S_N=\sqrt{\frac1N\sum_{i=1}^N(x_i-\overline{x})^2}.$$
mean = Mean(list)
stdev = 0
for i = 0 to list.size - 1
    stdev = stdev + (list[i] - mean)^2
stdev = sqrRoot( stdev / list.size )
Solution 1:
I think the easiest way to do this is with an orthogonality trick. I'll show how to incrementally compute the variance instead of the standard deviation; taking a square root at the end recovers the standard deviation. Let $X_1, X_2, \ldots$ be an i.i.d. sequence of random variables with $\bar X_n = n^{-1} \sum_{j = 1} ^ n X_j$ and $s^2_n$ the $n$'th sample variance (I use a denominator of $n-1$ instead of the $n$ in your formula to keep the estimator unbiased, but the same argument works with denominator $n$ after adjusting the coefficients accordingly). First write $$ s^2_n = \frac{\sum_{j = 1} ^ n (X_j - \bar X_n)^2}{n - 1} = \frac{\sum_{j = 1} ^ n (X_j - \bar X_{n - 1} + \bar X_{n - 1} - \bar X_n)^2}{n - 1}. $$ Expand this to get $$ s^2_n = \frac{(n - 2)s^2_{n - 1} + (n - 1) (\bar X_{n - 1} - \bar X_n)^2 + 2 \sum_{j = 1} ^ {n - 1} (X_j - \bar X_{n - 1})(\bar X_{n - 1} - \bar X_n) + (X_n - \bar X_{n})^2}{n - 1}, $$ and since $\sum_{j=1}^{n-1} (X_j - \bar X_{n-1}) = 0$, the cross term vanishes, which gives $$ s^2_n = \frac{(n - 2)s^2_{n - 1} + (n - 1)(\bar X_{n - 1} - \bar X_n)^2 + (X_n - \bar X_{n})^2}{n - 1}. $$
EDIT: I assumed you already have an incremental expression for the sample mean. It is much easier to get that: $\bar X_n = n^{-1}[X_n + (n-1)\bar X_{n-1}]$.
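The two recurrences above (mean, then variance) translate directly into code. Here is a minimal Python sketch; the class and method names are illustrative, not part of any standard library:

```python
import math

class IncrementalStd:
    """Running sample mean and variance via the Solution 1 recurrences.

    Uses the unbiased (n - 1) denominator, so std() matches the
    textbook sample standard deviation for n >= 2.
    """

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self.var = 0.0  # s^2_n; conventionally 0 until n >= 2

    def push(self, x):
        self.n += 1
        n = self.n
        if n == 1:
            self.mean = x
            return
        old_mean = self.mean
        # Incremental mean: X̄_n = [X_n + (n-1) X̄_{n-1}] / n
        self.mean = (x + (n - 1) * old_mean) / n
        # Incremental variance:
        # s^2_n = [(n-2) s^2_{n-1} + (n-1)(X̄_{n-1} - X̄_n)^2 + (X_n - X̄_n)^2] / (n-1)
        self.var = ((n - 2) * self.var
                    + (n - 1) * (old_mean - self.mean) ** 2
                    + (x - self.mean) ** 2) / (n - 1)

    def std(self):
        return math.sqrt(self.var)
```

Each `push` is $O(1)$ and needs only the previous mean and variance, which is exactly the incremental property asked for.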
Solution 2:
The standard deviation is a function of the totals $T_{\alpha}=\sum_{i=1}^{N}x_{i}^{\alpha}$ for $\alpha=0,1,2$, each of which can be calculated incrementally in an obvious way. In particular, $E[X]=T_{1}/T_{0}$ and $E[X^2]=T_{2}/T_{0}$, and the standard deviation is $$ \sigma = \sqrt{\text{Var}[X]} = \sqrt{E[X^2]-E[X]^2} = \frac{1}{T_0}\sqrt{T_{0}T_{2}-T_{1}^2}. $$ By maintaining totals of higher powers ($T_{\alpha}$ for $\alpha \ge 3$), you can derive similar "incremental" expressions for the skewness, kurtosis, and so on.
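As a sketch of this approach (names are illustrative), maintaining the three totals and applying the closed-form expression looks like:

```python
import math

class PowerSums:
    """Maintain the totals T_a = sum_i x_i^a for a = 0, 1, 2."""

    def __init__(self):
        self.t0 = 0.0  # count of values
        self.t1 = 0.0  # sum of x
        self.t2 = 0.0  # sum of x^2

    def push(self, x):
        self.t0 += 1
        self.t1 += x
        self.t2 += x * x

    def std(self):
        # sigma = sqrt(T0*T2 - T1^2) / T0, the population (1/N) standard
        # deviation, matching the question's formula.
        return math.sqrt(self.t0 * self.t2 - self.t1 ** 2) / self.t0
```

A caveat worth noting: subtracting $T_1^2$ from $T_0 T_2$ can suffer catastrophic cancellation in floating point when the mean is large relative to the spread, which is why the more stable online algorithm mentioned in Solution 3 is often preferred in practice.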
Solution 3:
What you refer to as an incremental computation is very close to the computer scientist's notion of an online algorithm. There is in fact a well-known, numerically stable online algorithm for computing the variance (and hence the standard deviation) of a sequence of data, commonly attributed to Welford.
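For reference, a minimal sketch of that online (Welford-style) algorithm in Python:

```python
import math

def online_mean_std(xs):
    """One-pass, numerically stable mean and population standard deviation.

    Maintains the running mean and M2, the sum of squared deviations
    from the current mean; each step is O(1).
    """
    n = 0
    mean = 0.0
    m2 = 0.0
    for x in xs:
        n += 1
        delta = x - mean
        mean += delta / n
        m2 += delta * (x - mean)  # note: uses the *updated* mean
    # Divide M2 by n for the population variance (the question's 1/N
    # convention); use n - 1 instead for the unbiased sample variance.
    return mean, math.sqrt(m2 / n)
```

Unlike the power-sums approach, this never subtracts two large nearly-equal quantities, so it behaves well in floating point even when the mean is large.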