Calculation of the Covariance of Gaussian Mixtures

$\newcommand{\var}{\operatorname{var}}$ You can write $x = y + \text{error}$, where $y = \mu_i$ with probability $\alpha_i$, for $i=1,\ldots,M$, and the conditional probability distribution of the "error" given $y$ is $N(0,C_i)$. Then we have $$ E(x) = E(E(x\mid y)) = E\left.\begin{cases} \vdots \\ \mu_i & \text{with probability }\alpha_i \\ \vdots \end{cases}\right\} = \sum_{i=1}^M\alpha_i\mu_i, $$ and $$ \begin{align} \var(x) = {} & E(\var(x\mid y)) + \var(E(x \mid y)) \\[12pt] = {} & E\left.\begin{cases} \vdots \\ C_i & \text{with probability }\alpha_i \\ \vdots \end{cases}\right\} \\ & {} + \var\left.\begin{cases} \vdots \\ \mu_i & \text{with probability }\alpha_i \\ \vdots \end{cases} \right\} \\[12pt] = {} & \sum_{i=1}^M \alpha_i C_i + \sum_{i=1}^M \alpha_i(\mu_i-\bar\mu)(\mu_i-\bar\mu)^T, \end{align} $$ where $\bar\mu=\sum_{i=1}^M \alpha_i\mu_i$.
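A minimal Monte Carlo sketch of this hierarchical representation (in Python with NumPy; the weights, means, and covariances below are arbitrary illustrative values, not anything from the question):

```python
import numpy as np

# Sample x = y + error: pick a component label with probabilities alpha_i,
# then add N(0, C_i) noise; compare empirical moments with the formulas above.
rng = np.random.default_rng(0)

alpha = np.array([0.3, 0.5, 0.2])                        # mixing weights
mus = np.array([[0.0, 0.0], [2.0, 1.0], [-1.0, 3.0]])    # component means mu_i
covs = np.array([np.eye(2),
                 [[2.0, 0.5], [0.5, 1.0]],
                 [[1.0, -0.3], [-0.3, 0.5]]])            # component covariances C_i

n = 200_000
labels = rng.choice(len(alpha), size=n, p=alpha)         # y = mu_i w.p. alpha_i
x = np.empty((n, 2))
for k in range(len(alpha)):
    idx = labels == k
    x[idx] = rng.multivariate_normal(mus[k], covs[k], size=idx.sum())

mu_bar = alpha @ mus                                  # sum_i alpha_i mu_i
within = np.einsum('k,kij->ij', alpha, covs)          # sum_i alpha_i C_i
dev = mus - mu_bar
between = np.einsum('k,ki,kj->ij', alpha, dev, dev)   # sum_i alpha_i (mu_i-mu_bar)(mu_i-mu_bar)^T

# Both checks should print True, up to Monte Carlo error.
print(np.allclose(x.mean(axis=0), mu_bar, atol=0.02))
print(np.allclose(np.cov(x, rowvar=False), within + between, atol=0.05))
```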


A mixture model commonly refers to a weighted sum of densities, not a weighted sum of random variables as in Sasha's answer. As the simplest example, a (scalar) random variable $Z$ is said to have a Gaussian mixture density if its probability density function is $$f_Z(z)=\alpha f_X(z)+(1-\alpha)f_Y(z), ~0<\alpha<1,$$ where $X$ and $Y$ are Gaussian random variables with different parameters, that is, $(\mu_X,\sigma_X^2)\neq(\mu_Y,\sigma_Y^2)$. It follows straightforwardly that $$\begin{aligned} E[Z]&= \int_{-\infty}^\infty zf_Z(z)\,\mathrm dz& &= \alpha\mu_X + (1-\alpha)\mu_Y\\ E[Z^2]&= \int_{-\infty}^\infty z^2f_Z(z)\,\mathrm dz& &= \alpha(\sigma_X^2+\mu_X^2) + (1-\alpha)(\sigma_Y^2 + \mu_Y^2)\\ \text{var}(Z) &= E[Z^2] - (E[Z])^2& &= \alpha(\sigma_X^2+\mu_X^2) + (1-\alpha)(\sigma_Y^2 + \mu_Y^2) - [\alpha\mu_X + (1-\alpha)\mu_Y]^2 \end{aligned}$$ which unfortunately does not simplify to a pretty formula.
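These three moment formulas are easy to sanity-check by integrating the mixture density numerically; here is a short sketch using SciPy (the parameter values are arbitrary):

```python
import numpy as np
from scipy.integrate import quad
from scipy.stats import norm

alpha, mu_x, sig_x, mu_y, sig_y = 0.3, -1.0, 1.0, 2.0, 0.5  # arbitrary choices

def f_Z(z):
    # mixture density: alpha * f_X(z) + (1 - alpha) * f_Y(z)
    return alpha * norm.pdf(z, mu_x, sig_x) + (1 - alpha) * norm.pdf(z, mu_y, sig_y)

EZ,  _ = quad(lambda z: z * f_Z(z), -np.inf, np.inf)     # E[Z] by quadrature
EZ2, _ = quad(lambda z: z**2 * f_Z(z), -np.inf, np.inf)  # E[Z^2] by quadrature

EZ_closed  = alpha * mu_x + (1 - alpha) * mu_y
EZ2_closed = alpha * (sig_x**2 + mu_x**2) + (1 - alpha) * (sig_y**2 + mu_y**2)

print(np.isclose(EZ, EZ_closed), np.isclose(EZ2, EZ2_closed))  # both should be True
print("var(Z) =", EZ2 - EZ**2)
```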

More generally, a mixture density has $n > 1$ terms with positive weights $\alpha_1, \alpha_2, \ldots, \alpha_n$ summing to $1$. It is simplest to think of a partition of the sample space into events $A_k, 1 \leq k \leq n$, with $P(A_k) = \alpha_k$. Then, $Z$ is a random variable whose conditional distribution given $A_k$ is $N(\mu_k,\sigma_k^2)$ with density $f_k(z)$, and thus the unconditional density is, via the law of total probability, $$f(z) = \sum_{k=1}^n \alpha_k f_k(z).$$ The ungainly expression $\alpha(\sigma_X^2+\mu_X^2) + (1-\alpha)(\sigma_Y^2 + \mu_Y^2) - [\alpha\mu_X + (1-\alpha)\mu_Y]^2$ derived earlier for the variance of $Z$ can thus be manipulated into $$\bigl[\alpha\sigma_X^2 + (1-\alpha)\sigma_Y^2\bigr] + \biggl(\bigl[\alpha\mu_X^2 + (1-\alpha)\mu_Y^2\bigr] - \bigl[\alpha\mu_X + (1-\alpha)\mu_Y\bigr]^2\biggr)$$ which, while still not pretty, can be identified as an illustration of the conditional variance formula:

The variance of $Z$ is the mean of the conditional variance plus the variance of the conditional mean.
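For the two-component mixture above, this decomposition is easy to verify in code (a small sketch with the same arbitrary parameters; pure arithmetic, no libraries needed):

```python
alpha, mu_x, sig_x, mu_y, sig_y = 0.3, -1.0, 1.0, 2.0, 0.5  # arbitrary choices

# mean of the conditional variance: alpha * sigma_X^2 + (1 - alpha) * sigma_Y^2
mean_cond_var = alpha * sig_x**2 + (1 - alpha) * sig_y**2

# variance of the conditional mean: E[mu^2] - (E[mu])^2 over the two events
var_cond_mean = (alpha * mu_x**2 + (1 - alpha) * mu_y**2
                 - (alpha * mu_x + (1 - alpha) * mu_y)**2)

# the "ungainly" direct expression E[Z^2] - (E[Z])^2 derived earlier
direct = (alpha * (sig_x**2 + mu_x**2) + (1 - alpha) * (sig_y**2 + mu_y**2)
          - (alpha * mu_x + (1 - alpha) * mu_y)**2)

assert abs((mean_cond_var + var_cond_mean) - direct) < 1e-12
```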

When $Z$ is a vector random variable whose conditional distributions are jointly Gaussian, similar calculations can be done, and the unconditional covariance matrix computed. Suppose that the conditional density of $Z$ given $A_k$ is a jointly Gaussian density with mean vector $\mu^{(k)}$ and covariance matrix $C^{(k)}$. Then, $$E[Z_i] = \sum_{k=1}^n \alpha_k \mu_i^{(k)} ~\text{and}~ E[Z_iZ_j] = \sum_{k=1}^n \alpha_k \bigl(C_{i,j}^{(k)} + \mu_i^{(k)}\mu_j^{(k)}\bigr)$$ giving $$\begin{align}C_{i,j} = \text{cov}(Z_i,Z_j) &= \sum_{k=1}^n \alpha_k \bigl(C_{i,j}^{(k)} + \mu_i^{(k)}\mu_j^{(k)}\bigr) - \left(\sum_{k=1}^n \alpha_k \mu_i^{(k)}\right) \left(\sum_{k=1}^n \alpha_k \mu_j^{(k)}\right)\\ &= \left(\sum_{k=1}^n \alpha_k C_{i,j}^{(k)}\right) + \left(\sum_{k=1}^n \alpha_k \mu_i^{(k)}\mu_j^{(k)} - \left(\sum_{k=1}^n \alpha_k \mu_i^{(k)}\right) \left(\sum_{k=1}^n \alpha_k \mu_j^{(k)}\right)\right) \end{align}$$ which is an illustration of the conditional covariance formula:

The covariance of two random variables is the mean of the conditional covariances plus the covariance of the conditional means.
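As a final cross-check, one can compute $C_{i,j}$ entrywise from the formula above and verify that it agrees with the matrix form $\sum_k \alpha_k C^{(k)} + \sum_k \alpha_k \bigl(\mu^{(k)}-\bar\mu\bigr)\bigl(\mu^{(k)}-\bar\mu\bigr)^T$ from the first answer (a sketch with arbitrary weights, means, and covariances):

```python
import numpy as np

alpha = np.array([0.2, 0.5, 0.3])                      # arbitrary weights
mus = np.array([[0.0, 1.0], [2.0, -1.0], [-1.0, 0.5]]) # mu^(k)
covs = np.array([[[1.0, 0.2], [0.2, 1.0]],
                 [[0.5, 0.0], [0.0, 2.0]],
                 [[1.5, -0.4], [-0.4, 0.8]]])          # C^(k)

d = mus.shape[1]
C = np.empty((d, d))
for i in range(d):
    for j in range(d):
        # E[Z_i Z_j] = sum_k alpha_k (C_{ij}^(k) + mu_i^(k) mu_j^(k))
        EZiZj = sum(alpha[k] * (covs[k, i, j] + mus[k, i] * mus[k, j])
                    for k in range(len(alpha)))
        # C_{ij} = E[Z_i Z_j] - E[Z_i] E[Z_j]
        C[i, j] = EZiZj - (alpha @ mus[:, i]) * (alpha @ mus[:, j])

mu_bar = alpha @ mus
dev = mus - mu_bar
C_matrix = np.einsum('k,kij->ij', alpha, covs) + np.einsum('k,ki,kj->ij', alpha, dev, dev)
assert np.allclose(C, C_matrix)  # both routes give the same covariance matrix
```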