what is the variance of a constant matrix times a random vector?

$\newcommand{\Var}{\operatorname{Var}}$In this video it is claimed that if the errors in OLS are given by $$u=y - X\beta$$ then in the presence of heteroskedasticity the variance of $u$ is no longer constant, $\sigma^2 I$ (where $I$ is the identity matrix), but rather $$\Var(u\mid X)=\sigma^2\Omega.$$ To account for the heteroskedasticity, we can estimate the transformed system, where $P$ is a transformation matrix: $$Py=PX\beta+Pu$$
Here "the variance of a constant matrix $P$ times a random vector $u$" is: $$\Var(Pu\mid X)=P\Var(u\mid X)P'=P(\sigma^2\Omega)P'$$ Can somebody explain to me the proof of that?
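For context, here is the standard generalized-least-squares motivation for the transformation (this reasoning is my addition, not spelled out in the video): if $P$ is chosen so that $P\Omega P' = I$ (for instance $P = \Omega^{-1/2}$ when $\Omega$ is positive definite), then the variance rule above makes the transformed errors homoskedastic:
$$\Var(Pu\mid X) = P(\sigma^2\Omega)P' = \sigma^2\, P\Omega P' = \sigma^2 I,$$
so OLS on the transformed system is efficient again.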


$$ \operatorname{var}(AX) = A\Big( \operatorname{var}(X) \Big) A^T. $$

  • $X\in\mathbb R^{\ell\times1}$ is a random column vector,
  • $\operatorname{var}(X) = \operatorname{E}((X-\mu)(X-\mu)^T)$, where $\mu=\operatorname{E}(X),$ is an $\ell\times\ell$ constant (i.e. non-random) matrix,
  • $A\in\mathbb R^{k\times\ell}$ is a constant matrix,
  • and so $\operatorname{var}(AX)\in\mathbb R^{k\times k}$ is a constant matrix.

The proof is this: \begin{align} & \operatorname{var}(AX) \\[10pt] = {} & \operatorname{E}\Big((AX-A\mu)(AX-A\mu)^T\Big) \\[10pt] = {} & \operatorname{E}\Big(\big(A(X-\mu)\big)\big(A(X-\mu)\big)^T\Big) \\[10pt] = {} & \operatorname{E}\Big(A(X-\mu)(X-\mu)^T A^T\Big) \\[10pt] = {} & A \operatorname{E}\Big((X-\mu)(X-\mu)^T \Big) A^T \\[10pt] = {} & A \Big( \operatorname{var}(X) \Big) A^T. \end{align} The first equality uses $\operatorname{E}(AX)=A\mu$ (linearity of expectation), the third uses $(BC)^T=C^TB^T$, and the fourth pulls the constant matrices $A$ and $A^T$ out of the expectation.
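As a quick numerical sanity check of the identity (a sketch with an arbitrarily chosen $A$ and covariance matrix, not anything from the video), one can compare the sample covariance of $AX$ against $A\operatorname{var}(X)A^T$:

```python
import numpy as np

rng = np.random.default_rng(0)

# Constant matrix A (k=2 by l=3), chosen arbitrarily for illustration.
A = np.array([[1.0, 2.0, 0.0],
              [0.0, 1.0, -1.0]])

# Build X = L Z with Z standard normal, so var(X) = L L^T exactly.
L = np.array([[1.0, 0.0, 0.0],
              [0.5, 1.0, 0.0],
              [0.2, 0.3, 1.0]])
Sigma = L @ L.T

# Draw many samples of X, then form AX.
Z = rng.standard_normal((3, 1_000_000))
X = L @ Z
AX = A @ X

# Empirical covariance of AX vs. the claimed A var(X) A^T.
emp = np.cov(AX)          # rows of AX are the k variables
theory = A @ Sigma @ A.T
print(np.max(np.abs(emp - theory)))  # small Monte Carlo error
```

The two $2\times 2$ matrices agree up to sampling noise, which shrinks as the number of draws grows.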