How is variance defined?

The variance of a random variable $X$ is defined as $E[(X-\mu)^2]$. Why can't it be defined as $E[|X-\mu|]$? That is, what is the basic idea behind this definition? Thank you.


Intuitively, there is no big difference between these two ways of measuring deviation from the mean. However, taking the square of the distance has several nice features.

Here are a couple of points from a statistical point of view. I would also emphasize the measure-theoretic advantages of the classical variance, which is the expectation of the squared difference.

  1. It has its argmin at the expectation, i.e. if $$ \mathsf E[(\xi-x)^2]\to\min_x $$ then $x^*:=\mathsf E\xi$ is a solution (by contrast, $\mathsf E|\xi-x|$ is minimized at a median of $\xi$); see the short derivation after this list,

  2. It induces the $L^2$ norm, so you obtain a Hilbert space, which has nicer properties than the Banach space $L^1$. For example, on centered random variables the norm is consistent with the inner product $$ \langle\xi,\eta \rangle = \mathrm{Cov}(\xi,\eta) $$
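To spell out the first point: assuming $\xi$ has a finite second moment, completing the square gives $$ \mathsf E[(\xi-x)^2] = \mathsf E[\xi^2] - 2x\,\mathsf E\xi + x^2 = \operatorname{Var}(\xi) + (x-\mathsf E\xi)^2, $$ which is minimized exactly at $x^*=\mathsf E\xi$, with minimal value $\operatorname{Var}(\xi)$.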


$\newcommand{\var}{\operatorname{var}}$

This seems to be a recurring question.

Maybe the most important part of the answer is that with the conventional definition of variance, if $X_1,\ldots,X_n$ are independent random variables, then $$ \var(X_1+\cdots+X_n) = \var(X_1)+\cdots+\var(X_n). $$
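For the record, here is the two-variable computation behind this (the general case follows by induction). Writing $\mu_i := E X_i$, $$ \var(X_1+X_2) = E\big[(X_1-\mu_1+X_2-\mu_2)^2\big] = \var(X_1)+\var(X_2)+2\,E[(X_1-\mu_1)(X_2-\mu_2)], $$ and the cross term vanishes because independence lets the expectation factor: $E[(X_1-\mu_1)(X_2-\mu_2)] = E[X_1-\mu_1]\,E[X_2-\mu_2] = 0$. The mean absolute deviation $E|X-\mu|$ satisfies no such identity.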

This makes it possible to compute the variance of the number of heads resulting from tossing a coin $20{,}000$ times (or of any other sum of i.i.d. random variables), and thus to use the central limit theorem to find things like the probability that the number of heads is within $10{,}000\pm60$.
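To make this concrete for a fair coin: each toss has variance $\frac14$, so $$ \var(\text{number of heads}) = 20{,}000\cdot\tfrac14 = 5000, \qquad \sigma=\sqrt{5000}\approx 70.7, $$ and the normal approximation gives roughly $$ P\big(|H-10{,}000|\le 60\big)\approx \Phi\!\left(\tfrac{60}{70.7}\right)-\Phi\!\left(-\tfrac{60}{70.7}\right)\approx 0.60. $$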


The variance of a variable is defined using the $L^2$-norm, which is much nicer than the $L^1$-norm because the $L^2$-norm comes from a scalar product.
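Explicitly, for square-integrable random variables the $L^2$-norm $\|X\|_2=\sqrt{E[X^2]}$ comes from the scalar product $$ \langle X,Y\rangle = E[XY], \qquad \|X\|_2^2=\langle X,X\rangle, $$ whereas the $L^1$-norm $\|X\|_1=E[|X|]$ is not induced by any scalar product (in nontrivial cases it fails the parallelogram law).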


$E[(X-\mu)^2]$ is preferred to $E[|X-\mu|]$ because the former has nice mathematical properties that are helpful in more advanced statistical work.

Also, some statisticians don't like a measure of dispersion that depends on a measure of central tendency (here $\mu$), so they use Gini's coefficient of concentration instead:

$$G=\frac{1}{n^2}\sum_{i=1}^{n}\sum_{j=1}^{n}(x_i-x_j)^2=2\cdot\frac{1}{n}\sum_{i=1}^{n}(x_i-\bar x)^2 $$ Here the variance appears as a function of the mutual differences. This is from descriptive statistics; in the probabilistic approach the sums $\sum$ can be replaced by expectations $E$.
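For completeness, the identity follows by expanding the square: $$ \frac{1}{n^2}\sum_{i,j}(x_i-x_j)^2 = \frac{1}{n^2}\sum_{i,j}\big(x_i^2-2x_ix_j+x_j^2\big) = \frac{2}{n}\sum_{i=1}^{n}x_i^2 - 2\bar x^2 = \frac{2}{n}\sum_{i=1}^{n}(x_i-\bar x)^2. $$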

So, this is the reason why we prefer $E[(X-\mu)^2]$.

That's all from me.