Conditional and Total Variance

A rigorous proof exists; it relies on the law of total expectation, which says that $E(E(X|Y))=E(X)$. Intuitively, $E(X|Y)$ is the expected value of $X$ given a particular value of $Y$, and $E(E(X|Y))$ averages that quantity over all values of $Y$. The dependence on $Y$ washes out, and we are left with $E(X)$.
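As a quick sanity check, here is a minimal numerical sketch of the tower property on a small made-up joint distribution (the probability table and the values of $X$ are purely hypothetical):

```python
import numpy as np

# Hypothetical joint pmf, for illustration only:
# rows index y in {0, 1}, columns index the values of X below.
xs = np.array([1.0, 2.0, 3.0])
p = np.array([[0.10, 0.20, 0.10],   # y = 0
              [0.30, 0.20, 0.10]])  # y = 1

p_y = p.sum(axis=1)                       # marginal P(Y = y)
e_x_given_y = (p * xs).sum(axis=1) / p_y  # E(X | Y = y) for each y

lhs = (p_y * e_x_given_y).sum()   # E(E(X|Y)): average the conditional means over Y
rhs = (p.sum(axis=0) * xs).sum()  # E(X) computed from the marginal of X

print(lhs, rhs)  # both 1.8 -- Y has washed out
```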

The variance law, $\operatorname{Var}(Y)=E(\operatorname{Var}(Y|X))+\operatorname{Var}(E(Y|X))$, is a bit more difficult to parse, but this is what it says to me. "How much does $Y$ vary? On average, it varies by the mean of the variances we get by fixing $X$; that is the first term, the expected variance of $Y$ about the mean of $Y|X$. But the conditional mean $E(Y|X)$ itself swings as $X$ varies, so we add on its variance; the second term is the variance of that mean."
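To see the two terms separately, here is a small Monte Carlo sketch with a made-up two-group model: $X$ picks a group, $Y$ is normal within the group, so the within-group variances and the spread of the group means can be read off directly (the means and standard deviations below are assumptions chosen only for the example):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000

# Hypothetical two-group model: X is uniform on {0, 1},
# and Y | X = x is Normal(mus[x], sigmas[x]**2).
mus    = np.array([0.0, 3.0])   # E(Y | X = x), the conditional means
sigmas = np.array([1.0, 2.0])   # sd(Y | X = x)

x = rng.integers(0, 2, size=n)
y = rng.normal(mus[x], sigmas[x])

within  = (sigmas ** 2).mean()  # E(Var(Y|X)): average variance with X fixed
between = mus.var()             # Var(E(Y|X)): swing of the conditional mean
print(y.var(), within + between)  # both approximately 4.75
```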


Geometrically, it's just the Pythagorean theorem: we may measure the "length" of a random variable by its standard deviation.

We start with a random variable $Y$. $E(Y|X)$ is the projection of $Y$ onto the set of random variables which may be expressed as a deterministic function of $X$.
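One standard way to make "projection" precise is the mean-square ($L^2$) sense: among all functions of $X$, the conditional expectation is the one closest to $Y$,

$$E\big[(Y - E(Y|X))^2\big] \;\le\; E\big[(Y - f(X))^2\big] \quad \text{for every function } f.$$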

The hypotenuse is $Y$, with squared length $\operatorname{Var}(Y)$.

The first leg is $E(Y|X)$, with squared length $\operatorname{Var}(E(Y|X))$.

The second leg is $Y-E(Y|X)$, with squared length $\operatorname{Var}(Y-E(Y|X)) = E\big[(Y-E(Y|X))^2\big] = E\big[E\big((Y-E(Y|X))^2 \,\big|\, X\big)\big] = E(\operatorname{Var}(Y|X))$; the first equality holds because $Y-E(Y|X)$ has mean zero, by the law of total expectation.
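What makes the triangle right-angled is again the tower property: the residual leg is orthogonal in $L^2$ to every function of $X$, and in particular to the first leg. For any function $g$,

$$E\big[(Y-E(Y|X))\,g(X)\big] = E\Big[E\big[(Y-E(Y|X))\,g(X)\,\big|\,X\big]\Big] = E\big[g(X)\big(E(Y|X)-E(Y|X)\big)\big] = 0,$$

so taking $g(X)=E(Y|X)-E(Y)$, the squared lengths of the two legs add:

$$\operatorname{Var}(Y) = \operatorname{Var}(E(Y|X)) + E(\operatorname{Var}(Y|X)).$$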