If $X$ is normally distributed, then $X^T \Sigma ^{-1} X$ follows a chi-square distribution.

Suppose $X\sim \mathcal{N} _p (0, \Sigma )$. I am not sure why $X^{\top} \Sigma ^{-1} X$ follows a $\chi ^2$-distribution with $p$ degrees of freedom.

I think it has something to do with the square root of $\Sigma ^{-1}$, since $$\Sigma ^{-1/2} X \sim \mathcal{N} _p (0, \Sigma ^{-1/2} \Sigma {\Sigma ^{-1/2} }^{\top}) = \mathcal{N} _p (0, \Sigma ^{-1/2} \Sigma ^{1/2} \Sigma ^{1/2} \Sigma ^{-1/2} ) = \mathcal{N} _p (0, I_p),$$ but how do I know that $\Sigma$ and $\Sigma ^{-1}$ have square roots and how do I know that $\Sigma ^{-1/2}$ is symmetric?


Solution 1:

Suppose $X\sim \mathcal{N} _p (0, \Sigma )$ with $\Sigma$ an invertible covariance matrix.

To prove: $X^{\top} \Sigma ^{-1} X$ follows a $\chi ^2$-distribution with $p$ degrees of freedom.

Proof: Since $\Sigma$ is a symmetric real matrix, there are eigenvalues $\lambda _1, \lambda _2, \dots, \lambda _p$ and corresponding orthonormal eigenvectors $u_1, u_2, \dots, u_p$ such that $$\Sigma = Q \Lambda Q^{\top}$$ with $Q$ the $p\times p$ matrix with the orthonormal eigenvectors as columns, and $\Lambda$ the diagonal matrix with the eigenvalues on the diagonal. Note that $$Q^{\top} Q = I_p$$ since $u_1, u_2, \dots, u_p$ are orthonormal. The eigenvalues of $\Sigma$ are non-zero because $\Sigma$ is of full rank. Moreover, $$\text{var}(Q^{\top}X) = Q^{\top} \Sigma Q = Q^{\top} Q\Lambda Q ^{\top} Q = \Lambda,$$ so each $\lambda _i$ is the variance of a component of $Q^{\top} X$ and is therefore non-negative; being non-zero, the eigenvalues $\lambda _1, \lambda _2, \dots, \lambda _p$ are positive. This also shows that $\Sigma$ is positive definite.

Let $$\Lambda ^{1/2} = \left [\lambda _i ^{1/2} \delta _{ij} \right ] _{1\leq i,j \leq p} \ \ \text{ and } \ \ \Sigma ^{1/2} = Q \Lambda ^{1/2} Q^{\top}.$$ We immediately see that $\Sigma ^{1/2}$ is a symmetric matrix. The matrix $\Sigma ^{1/2}$ is positive definite, since $\Lambda ^{1/2}$ is positive definite, $$v^{\top} \Sigma ^{1/2} v= v^{\top} Q \Lambda ^{1/2} Q ^{\top} v = (Q^{\top} v)^{\top} \Lambda ^{1/2} (Q ^{\top} v) \geq 0 \ \ \ \ \text{ for all $p\times 1$ vectors $v$,}$$ where the inequality is strict for $v\neq 0$. We have, $$ \Sigma ^{1/2} \Sigma ^{1/2} = Q \Lambda ^{1/2} Q^{\top} Q \Lambda ^{1/2} Q^{\top} = Q \Lambda ^{1/2} \Lambda ^{1/2} Q^{\top} = Q \Lambda Q^{\top} = \Sigma . $$ Thus, $\Sigma ^{1/2}$ is a symmetric positive-definite square root of $\Sigma$.
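
As a concrete illustration of this construction (the matrix below is an assumed example, not taken from the question), take $$\Sigma = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}, \qquad \lambda _1 = 3, \ \lambda _2 = 1, \qquad Q = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}.$$ Then $$\Sigma ^{1/2} = Q \Lambda ^{1/2} Q^{\top} = \frac{1}{2} \begin{pmatrix} \sqrt{3}+1 & \sqrt{3}-1 \\ \sqrt{3}-1 & \sqrt{3}+1 \end{pmatrix},$$ which is visibly symmetric. Squaring it gives diagonal entries $\frac{1}{4}\left ((\sqrt{3}+1)^2 + (\sqrt{3}-1)^2 \right ) = 2$ and off-diagonal entries $\frac{1}{4} \cdot 2(\sqrt{3}+1)(\sqrt{3}-1) = 1$, which recovers $\Sigma$.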

Since matrix inverses are unique and since $\Sigma$ is symmetric, it is easy to show that $(\Sigma ^{-1} ) ^{\top} = \Sigma ^{-1} $. The inverse $\Sigma ^{-1}$ has the same eigenvectors as $\Sigma$ and the eigenvalues of $\Sigma ^{-1}$ are the reciprocals of the eigenvalues of $\Sigma$. Since these reciprocals are positive, $\Sigma ^{-1}$ is positive definite. Let $$\Lambda ^{-1} = \left [\delta _{ij} / \lambda _i \right ] _{1\leq i, j \leq p}, \ \Lambda ^{-1/2} = \left [\delta _{ij} / \left (\lambda _i ^{1/2} \right ) \right ] _{1\leq i,j \leq p} \ \ \text{ and } \ \ \Sigma ^{-1/2} = Q \Lambda ^{-1/2} Q^{\top}.$$ The matrix $\Sigma ^{-1/2}$ is obviously symmetric and it is positive definite since $\Lambda ^{-1/2}$ is positive definite.

Now, we have $$ \Sigma ^{-1/2} \Sigma ^{-1/2} = Q \Lambda ^{-1/2} Q^{\top} Q \Lambda ^{-1/2} Q^{\top} = Q \Lambda ^{-1/2} \Lambda ^{-1/2} Q^{\top} = Q \Lambda ^{-1} Q ^{\top} = \Sigma ^{-1}.$$ Thus, $\Sigma ^{-1/2}$ is a symmetric positive-definite square root of $\Sigma ^{-1}$.

It follows that \begin{align*} \Sigma ^{-1/2} X \sim \mathcal{N} _p (0, \Sigma ^{-1/2} \Sigma \Sigma ^{-1/2} ) & = \mathcal{N} _p (0, Q \Lambda ^{-1/2} Q^{\top} Q \Lambda Q^{\top}Q \Lambda ^{-1/2} Q^{\top} ) \\ & = \mathcal{N} _p (0, Q \Lambda ^{-1/2} \Lambda \Lambda ^{-1/2} Q^{\top} ) \\ & = \mathcal{N} _p (0, Q Q^{\top} ) \\ & = \mathcal{N} _p (0, I_p ) \end{align*} where we used that $Q^{\top} Q = I_p$ and, since $Q$ is square, also $Q Q^{\top} = I_p$.
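
The whitening step can be checked numerically. The following is a minimal sketch in plain Python, assuming the illustrative matrix $\Sigma = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}$ (not from the question), whose eigen-decomposition is known in closed form. The helpers `matmul` and `transpose` are hypothetical names introduced only for this example.

```python
import math

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    """Transpose a matrix given as a list of rows."""
    return [list(row) for row in zip(*a)]

# Illustrative covariance matrix (an assumption, not from the question).
sigma = [[2.0, 1.0], [1.0, 2.0]]

# Closed-form eigen-decomposition: eigenvalues 3 and 1,
# orthonormal eigenvectors (1, 1)/sqrt(2) and (1, -1)/sqrt(2) as columns of Q.
s = 1.0 / math.sqrt(2.0)
Q = [[s, s], [s, -s]]
eigenvalues = [3.0, 1.0]

# Sigma^{-1/2} = Q Lambda^{-1/2} Q^T
lam_inv_sqrt = [[1.0 / math.sqrt(eigenvalues[0]), 0.0],
                [0.0, 1.0 / math.sqrt(eigenvalues[1])]]
sigma_inv_sqrt = matmul(matmul(Q, lam_inv_sqrt), transpose(Q))

# Covariance of Sigma^{-1/2} X; up to rounding this should be the identity.
whitened_cov = matmul(matmul(sigma_inv_sqrt, sigma), sigma_inv_sqrt)
print(whitened_cov)
```

Up to floating-point rounding, `whitened_cov` should be the $2\times 2$ identity and `sigma_inv_sqrt` should be symmetric, matching the derivation.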

Now, we have $$X^{\top} \Sigma ^{-1} X = (\Sigma ^{-1/2} X) ^{\top} (\Sigma ^{-1/2} X) \sim \chi ^2 _p .$$
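
A quick Monte Carlo sanity check of the conclusion, as a sketch rather than a proof: it assumes $p = 2$ and an illustrative $\Sigma = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}$, and uses only the Python standard library. The function name `quadratic_form_samples` is hypothetical. For $\chi ^2 _2$ one expects mean $2$, variance $4$ and median $2\ln 2$.

```python
import math
import random

def quadratic_form_samples(a, b, c, n, seed=0):
    """Draw n samples of X^T Sigma^{-1} X for X ~ N_2(0, Sigma),
    where Sigma = [[a, b], [b, c]] is assumed positive definite."""
    rng = random.Random(seed)
    det = a * c - b * b
    # Cholesky factor L of Sigma (Sigma = L L^T), used only to sample X = L Z.
    l11 = math.sqrt(a)
    l21 = b / l11
    l22 = math.sqrt(c - l21 * l21)
    samples = []
    for _ in range(n):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        x1 = l11 * z1
        x2 = l21 * z1 + l22 * z2
        # For a 2x2 matrix, Sigma^{-1} = (1/det) [[c, -b], [-b, a]].
        samples.append((c * x1 * x1 - 2.0 * b * x1 * x2 + a * x2 * x2) / det)
    return samples

q = quadratic_form_samples(a=2.0, b=1.0, c=2.0, n=20000)
mean = sum(q) / len(q)
var = sum((v - mean) ** 2 for v in q) / (len(q) - 1)
# The median of chi-square with 2 degrees of freedom is 2*ln(2).
frac_below_median = sum(v <= 2.0 * math.log(2.0) for v in q) / len(q)
print(mean, var, frac_below_median)
```

The printed values should be close to $2$, $4$ and $0.5$ respectively, up to sampling noise.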

Solution 2:

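In the case where $\Sigma$ is singular, $\Sigma ^{-1}$ does not exist and has to be replaced by a generalized inverse (for example the Moore-Penrose pseudoinverse); the number of degrees of freedom in the chi-square distribution is then smaller than $p;$ in any case it's the rank of $\Sigma.$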
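
A small worked example of the rank claim, reading $\Sigma ^{-1}$ as the Moore-Penrose pseudoinverse $\Sigma ^{+}$ (this reading is an assumption of the example): take $p = 2$ and $X = (Z, Z)^\top$ with $Z \sim \mathcal{N}(0,1).$ Then $$ \Sigma = \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix}, \qquad \Sigma ^{+} = \frac{1}{4} \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix}, \qquad X^\top \Sigma ^{+} X = \frac{1}{4} (Z + Z)^2 = Z^2 \sim \chi ^2 _1, $$ so the number of degrees of freedom is $\operatorname{rank} \Sigma = 1 < p.$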

You have $\Sigma=\operatorname E((X-\mu)(X-\mu)^\top) = \operatorname E(XX^\top)$ where $\mu=0$ is the $p\times 1$ column vector $\operatorname E(X).$

From that it is obvious that $\Sigma$ is symmetric. It is easy to show $\Sigma$ is positive definite: $$ a^\top \Sigma a = \operatorname{var}(a^\top X) \ge 0 $$ if $a$ is any $p\times 1$ constant (i.e. non-random) vector. The random variable $a^\top X$ is scalar-valued so its variance is a non-negative scalar, strictly positive if $a\ne0$ (and that last uses the assumption that $\Sigma$ is non-singular).

A theorem of linear algebra says that since $\Sigma$ is symmetric and all of its entries are real, there is some orthogonal matrix $G$ (i.e. a matrix $G$ for which $G^\top G = GG^\top = I_p$) and some diagonal matrix $\Lambda$ such that $\Sigma = G\Lambda G^\top.$

The diagonal entries in $\Lambda$ must be positive since they are variances of components of $G^\top X.$

So now replace the positive numbers that are the diagonal entries of $\Lambda$ with their square roots and call that $\Lambda^{1/2}$ and try to show that $G\Lambda^{1/2} G^\top$ is a symmetric positive-definite square root of $\Sigma.$

Solution 3:

Since $\Sigma$ is a covariance matrix, it's symmetric with non-negative eigenvalues, and the eigenvalues are all positive when $\Sigma$ is invertible.