Intuitive explanation of variance and moment in Probability

While I understand the intuition behind expectation, I don't really understand the meaning of variance and moment.

What is a good way to think of those two terms?


Solution 1:

The variance of a random variable $X$ is $Var(X) = \mathbb{E}[(X-\mathbb{E}(X))^2]$. It is used as a measure of the dispersion of a random variable. If you are interested in how far a random variable typically is from what you expect, you might instead consider $\mathbb{E}(|X-\mathbb{E}(X)|)$, but see https://stats.stackexchange.com/questions/118/standard-deviation-why-square-the-difference-instead-of-taking-the-absolute-val for why we use the square instead.
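To make the two candidate measures concrete, here is a minimal sketch that evaluates both for a fair six-sided die. The die, the helper `expectation`, and the variable names are my own illustrative choices, not anything fixed by the definitions above.

```python
from fractions import Fraction

def expectation(pmf, g=lambda x: x):
    """E[g(X)] for a discrete distribution given as {value: probability}."""
    return sum(p * g(x) for x, p in pmf.items())

# Assumed example: a fair six-sided die.
die = {k: Fraction(1, 6) for k in range(1, 7)}

mean = expectation(die)
variance = expectation(die, lambda x: (x - mean) ** 2)     # E[(X - E X)^2]
mean_abs_dev = expectation(die, lambda x: abs(x - mean))   # E|X - E X|

print(mean, variance, mean_abs_dev)  # 7/2 35/12 3/2
```

Both numbers describe spread, but the variance is in squared units, which is why its square root (the standard deviation) is usually the quantity compared with the mean absolute deviation.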

For every moment except the first (i.e. the mean), the central moments (centred about the mean) are usually far more interesting than the raw ones.

The $r$th central moment of $X$ is $\mathbb{E}[(X-\mathbb{E}(X))^r]$. The lower moments ($r<5$) are related to familiar properties of the distribution:

$r=0$ is $1$

$r=1$ is $0$

$r=2$ is the variance, which is a measure of the spread about the mean.

$r=3$ is related to skewness, which is a measure of the asymmetry of the distribution. See: http://en.wikipedia.org/wiki/Skewness

$r=4$ is related to kurtosis, which is often described as a measure of the 'peakedness' of a distribution, although it mostly reflects how heavy the tails are. See: http://en.wikipedia.org/wiki/Kurtosis

For $r\geq5$ the moments become harder to interpret (and, from data, harder to estimate reliably). The 5th central moment is a measure of the asymmetry of the distribution's tails, but I would use skewness instead.
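The list above can be checked numerically. The following is only a sketch on a small made-up distribution (the values 0, 1, 3 and their probabilities are assumptions for illustration). It uses the usual standardisations, skewness $=\mu_3/\mu_2^{3/2}$ and kurtosis $=\mu_4/\mu_2^2$, where $\mu_r$ is the $r$th central moment.

```python
from fractions import Fraction

def central_moment(pmf, r):
    """r-th central moment E[(X - E X)^r] of a discrete distribution."""
    mean = sum(p * x for x, p in pmf.items())
    return sum(p * (x - mean) ** r for x, p in pmf.items())

# Assumed example: a right-skewed distribution with mean 1.
pmf = {0: Fraction(1, 2), 1: Fraction(1, 4), 3: Fraction(1, 4)}

moments = [central_moment(pmf, r) for r in range(5)]
print([str(m) for m in moments])  # ['1', '0', '3/2', '3/2', '9/2']

# Standardised versions of the 3rd and 4th central moments.
skewness = float(moments[3]) / float(moments[2]) ** 1.5
kurtosis = moments[4] / moments[2] ** 2
print(skewness, kurtosis)
```

The zeroth and first central moments come out as 1 and 0, as they must for any distribution, and the positive third moment reflects the longer right tail.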

Solution 2:

There are many good answers possible, depending on how one's intuition works and one's experience with related things like averages, $L^p$ norms, etc. One answer that builds on your intuition of an expectation is to think of the higher central moments as weighted expectations. The central third moment, for example, is defined as the expectation of the third powers of deviations from the mean. Split a third power into two parts: one is the deviation from the mean (let's call it, after common statistical parlance, the "residual") and the other is the squared deviation from the mean (ergo, the squared residual). Treat the latter as a weight. As the residual gets larger in size, these weights get larger (in a quadratic fashion). The third central moment is the weighted "average" of the residuals. (I have to put "average" in quotes because, strictly speaking, to deserve being called an average one would require the weights themselves to average out to 1, but they don't usually do that.) Thus, the third central moment gives disproportionately greater emphasis to larger residuals compared to smaller ones. (It is possible for that emphasis to be so great that the third central moment doesn't even exist: if the probability doesn't decay rapidly enough with the size of the residual, the weights can swamp the probability, with the net effect of creating an infinitely great third central moment.)

You can understand, and even analyze, all other central moments in the same way, even fractional moments and absolute moments. It should be immediately clear, for instance, that the higher the moment, the more the weights are emphasized at extreme values: higher central moments measure (average) behavior of the probability distribution at greater distances from the mean.
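Here is a small sketch of the weighting picture for the third central moment. The two-point distribution is an assumption chosen so that the mean is zero and one residual is much larger than the other; the variable names are mine.

```python
from fractions import Fraction

# Assumed example: a common small residual and a rare large one; the mean is 0.
pmf = {-1: Fraction(9, 10), 9: Fraction(1, 10)}
mean = sum(p * x for x, p in pmf.items())

residuals = {x: x - mean for x in pmf}          # deviation from the mean
weights = {x: residuals[x] ** 2 for x in pmf}   # squared residual, used as a weight

# The unweighted average of the residuals is always 0.
plain_average = sum(pmf[x] * residuals[x] for x in pmf)

# The weighted "average" of the residuals is the third central moment.
third_central_moment = sum(pmf[x] * weights[x] * residuals[x] for x in pmf)

# The weights average to the variance, not to 1, hence the quotes around "average".
average_weight = sum(pmf[x] * weights[x] for x in pmf)

print(plain_average, third_central_moment, average_weight)  # 0 72 9
```

The rare residual of 9 carries a weight of 81 and contributes 72.9 to the total, while the common residual of -1 contributes only -0.9, so the large deviation dominates the result.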

Solution 3:

The probabilistic measures of location and dispersion can be "seen" as the corresponding moments of a mechanical system of "probabilistic masses".

The expectation has the following mechanical interpretation. Given that $F(x)$ is the "probabilistic mass" contained in the interval $0\le X\lt x$ (in one dimension), the mathematical expectation of the random variable $X$ is the static moment (first moment) of the system of "probabilistic masses" with respect to the origin.

The variance is the analog of the moment of inertia of the system of "probabilistic masses" with respect to its center of mass.

The variance of the random variable $X$ is the 2nd order moment of $X-m$, where $m$ is the expectation of $X$, i.e. $m=E(X)=\displaystyle\int_0^{\infty} x\, dF(x)$ ($F(x)$ is the cumulative distribution function).

The $k$th order central moment is the expectation of $(X-m)^k$.
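The mechanical analogy can be tried out on a few point masses. This is a sketch under assumed values: the positions, the masses, and the helper `moment_about` are hypothetical. It also checks the parallel-axis relation, which in probabilistic terms reads $E(X^2) = Var(X) + m^2$.

```python
from fractions import Fraction

def moment_about(masses, axis, order):
    """Sum of mass * (position - axis)**order over all point masses."""
    return sum(m * (x - axis) ** order for x, m in masses.items())

# Hypothetical system: position -> "probabilistic mass" (total mass is 1).
masses = {0: Fraction(1, 4), 2: Fraction(1, 2), 4: Fraction(1, 4)}

# Static (first) moment about the origin; with total mass 1 it is the expectation.
static_moment = moment_about(masses, 0, 1)
center_of_mass = static_moment / sum(masses.values())

# Moment of inertia about the center of mass: the variance.
inertia_about_center = moment_about(masses, center_of_mass, 2)

# Moment of inertia about the origin: E(X^2).
inertia_about_origin = moment_about(masses, 0, 2)

print(center_of_mass, inertia_about_center, inertia_about_origin)  # 2 2 6
```

Moving the axis from the center of mass to the origin adds the squared distance between them (times the total mass of 1), which is why 6 = 2 + 2².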

Solution 4:

The variance is the expected squared deviation between the random variable and its expectation.

It measures how much the random variable scatters or spreads around its expectation.