Standard Deviation. Why do we take the square root of the entire equation?

Please forgive my lack of maths knowledge.

It is my understanding that:

Standard Deviation is the average distance from the mean in a data set of numbers.

Therefore it stands to reason that working out the standard deviation of the data set $\{x_i\} = \{1,2,3,4,5\}$ would involve the following.

First I work out the mean, $\mu = 3$. Then I work out the sum of the distances from the mean, $\sum |x_i-\mu| = 6$, and finally divide by the number of values: $\frac{\sum |x_i-\mu|}{N} = \frac{6}{5} = 1.2$.

This means that, according to my reasoning, 1.2 is the standard deviation.

However, when I use the formula $\sqrt{\frac{\sum (x_i-\mu)^2}{N}}$, I get approximately $1.414$.
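For anyone who wants to check both computations, here is a minimal Python sketch (standard library only, same data set) that reproduces the $1.2$ and the $\approx 1.414$:

```python
from math import sqrt

data = [1, 2, 3, 4, 5]
n = len(data)
mu = sum(data) / n  # mean = 3.0

# Mean absolute deviation: the average distance from the mean
mad = sum(abs(x - mu) for x in data) / n
print(mad)  # 1.2

# Population standard deviation: square root of the average *squared* distance
sd = sqrt(sum((x - mu) ** 2 for x in data) / n)
print(sd)  # 1.4142135623730951, i.e. sqrt(2)
```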

Can someone explain why I'm wrong, in layman's terms? Thank you.


Solution 1:

Standard deviation is not the average distance from the mean, as your example shows.

The reason for using the standard deviation rather than the mean absolute deviation is that variances add: the variance of $\{x_i\}_{i=1}^n$ plus the variance of $\{y_j\}_{j=1}^m$ equals the variance of the set of all sums $\{x_i+y_j\}_{i=1,\,j=1}^{n,\,m}$ (but only if you define variance in the way that puts $n$ and $m$, rather than the Bessel-corrected $n-1$ and $m-1$, in the denominators). This makes it possible, for example, to apply the central limit theorem to find the probability that when you toss $1800$ coins, the number of heads is between $890$ and $920$: you can find the standard deviation of the number of heads because of the additivity of variances.
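Here is a quick numerical sketch of that additivity claim, with arbitrary made-up data (the `pop_variance` helper is hypothetical, written with $n$ rather than $n-1$ in the denominator, as the answer requires):

```python
from itertools import product
from math import sqrt

def pop_variance(values):
    """Population variance: n, not the Bessel-corrected n-1, in the denominator."""
    mu = sum(values) / len(values)
    return sum((v - mu) ** 2 for v in values) / len(values)

xs = [1, 2, 3, 4, 5]
ys = [10, 20, 40]

# The n*m multiset of all pairwise sums x_i + y_j
sums = [x + y for x, y in product(xs, ys)]

print(pop_variance(xs) + pop_variance(ys))  # 157.5555...
print(pop_variance(sums))                   # identical: variances add

# Coin-toss example: each fair coin contributes variance 1/4, so 1800 coins
# give variance 1800/4 = 450 and standard deviation sqrt(450) ~ 21.2, which
# feeds straight into a normal approximation of P(890 <= heads <= 920).
print(sqrt(1800 / 4))  # 21.213203435596427
```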

Standard deviation and mean absolute deviation have in common that they are both translation-invariant and both scale-equivariant for non-negative changes in scale, so either can be used as a measure of dispersion.
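And a sketch of that closing point: shifting the data leaves both measures unchanged, while rescaling by a non-negative factor rescales both by that same factor (made-up data again):

```python
from math import sqrt

def mad(values):
    mu = sum(values) / len(values)
    return sum(abs(v - mu) for v in values) / len(values)

def sd(values):
    mu = sum(values) / len(values)
    return sqrt(sum((v - mu) ** 2 for v in values) / len(values))

data = [1, 2, 3, 4, 5]
shifted = [x + 100 for x in data]  # translation: dispersion unchanged
scaled = [3 * x for x in data]     # non-negative rescaling: dispersion triples

print(mad(data), mad(shifted), mad(scaled))  # 1.2  1.2  3.6
print(sd(data), sd(shifted), sd(scaled))     # ~1.414  ~1.414  ~4.243
```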