The sum of $n$ independent normal random variables.

How can I prove that the sum of random variables $X_1, X_2, \ldots, X_n$, each of which has a normal distribution $N(\mu_i, \sigma_i^2)$, is itself a normally distributed random variable with mean $$\mu =\sum_{i=1}^n \mu_i$$

and variance

$$\sigma^2 = \sum_{i=1}^n \sigma_i^2$$

Edit: I forgot to add that this is under the assumption that $X_1, X_2, \ldots, X_n$ are all independent.
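(Not a proof, but a quick numerical sanity check of the claim; a minimal sketch using NumPy, with arbitrarily chosen means and standard deviations:)

```python
import numpy as np

rng = np.random.default_rng(seed=0)
mu = np.array([1.0, -2.0, 0.5])      # hypothetical means
sigma = np.array([0.5, 1.5, 2.0])    # hypothetical standard deviations

# One million independent draws of (X_1, X_2, X_3), summed within each draw.
y = rng.normal(loc=mu, scale=sigma, size=(1_000_000, 3)).sum(axis=1)

print(y.mean(), mu.sum())            # both ~ -0.5
print(y.var(), (sigma ** 2).sum())   # both ~ 6.5
```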


Solution 1:

I figured this out on the transit ride to work this morning. I used the moment-generating function, which plays the same role as the "probability-generating function" recommended in another answer (the PGF is the analogue for integer-valued random variables). Anyway...

If we let $Y = X_{1} + X_{2} + \cdots + X_{n}$, then the moment-generating function of $Y$ is given by:

$$ M_{Y}(t) = E[e^{t(X_{1} + X_{2} + \cdots + X_{n})}] = \prod_{i=1}^{n}E[e^{tX_{i}}] = \prod_{i=1}^{n}M_{X_{i}}(t), $$

where the middle equality is where the independence of the $X_{i}$ is used.

The moment-generating function of each $X_{i} \sim N(\mu_{i}, \sigma_{i}^2)$ is

$$M_{X_{i}}(t) = \exp\left[\mu_{i}t + \frac{\sigma_{i}^2t^2}{2}\right]$$
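(If this standard formula is unfamiliar, it follows by completing the square in the exponent of the Gaussian integral:

$$ M_{X_{i}}(t) = \int_{-\infty}^{\infty} e^{tx}\,\frac{1}{\sigma_{i}\sqrt{2\pi}}\,e^{-(x-\mu_{i})^2/(2\sigma_{i}^2)}\,dx = e^{\mu_{i}t + \sigma_{i}^2t^2/2}\int_{-\infty}^{\infty} \frac{1}{\sigma_{i}\sqrt{2\pi}}\,e^{-(x-\mu_{i}-\sigma_{i}^2t)^2/(2\sigma_{i}^2)}\,dx = e^{\mu_{i}t + \sigma_{i}^2t^2/2}, $$

since the remaining integrand is the $N(\mu_{i}+\sigma_{i}^2t,\ \sigma_{i}^2)$ density and integrates to $1$.)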

Substituting this into the product, we have

$$ M_{Y}(t) = \prod_{i=1}^{n}\exp[\mu_{i}t + \frac{\sigma_{i}^2t^2}{2}] $$

then by properties of exponents we have

$$ M_{Y}(t) = \prod_{i=1}^{n}\exp[\mu_{i}t + \frac{\sigma_{i}^2t^2}{2}] = \prod_{i=1}^{n}\exp[\mu_{i}t]\exp[\frac{\sigma_{i}^2t^2}{2}] = \exp[t(\sum_{i=1}^{n}\mu_{i}) + \frac{t^2}{2}(\sum_{i=1}^{n}\sigma_{i}^2)] $$

Since

$$ M_{Y}(t) = \exp\left[t\left(\sum_{i=1}^{n}\mu_{i}\right) + \frac{t^2}{2}\left(\sum_{i=1}^{n}\sigma_{i}^2\right)\right] $$

is precisely the moment-generating function of a normal distribution, and a moment-generating function that is finite on an open interval around $0$ determines the distribution uniquely, $Y$ is normally distributed with mean $\sum_{i=1}^{n}\mu_{i}$ and variance $\sum_{i=1}^{n}\sigma_{i}^2$.

Solution 2:

Use generating functions. For continuous random variables the appropriate tools are the moment-generating function (as in Solution 1) or the characteristic function; the probability-generating function proper is defined for integer-valued random variables.
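(A sketch of that route, using characteristic functions, which exist for every distribution: one has

$$ \varphi_{X_{i}}(t) = E[e^{itX_{i}}] = \exp\left[i\mu_{i}t - \frac{\sigma_{i}^2t^2}{2}\right], \qquad \varphi_{Y}(t) = \prod_{i=1}^{n}\varphi_{X_{i}}(t) = \exp\left[it\sum_{i=1}^{n}\mu_{i} - \frac{t^2}{2}\sum_{i=1}^{n}\sigma_{i}^2\right] $$

by independence, and the uniqueness theorem for characteristic functions then identifies $Y$ as $N\left(\sum_{i=1}^{n}\mu_{i},\ \sum_{i=1}^{n}\sigma_{i}^2\right)$.)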

Solution 3:

The result as stated before the edit (i.e., without independence) is false: consider $n=2$, $\mu_1=0$, $\sigma_1^2\ne0$, and $X_2=SX_1$, where $S=\pm1$ with probability $\frac12$ each, independent of $X_1$. Then $X_1$ and $X_2$ are each normal $(0,\sigma_1^2)$, but $X_1+X_2$ is not normal $(0,2\sigma_1^2)$, since $X_1+X_2=(1+S)X_1$ equals $0$ with probability $\frac12$ and hence is not normal at all.
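(A quick simulation illustrating the atom at $0$; a sketch with $\sigma_1=1$:)

```python
import numpy as np

rng = np.random.default_rng(seed=0)
n = 1_000_000

x1 = rng.normal(0.0, 1.0, n)      # X1 ~ N(0, 1)
s = rng.choice([-1.0, 1.0], n)    # S = +/-1 with probability 1/2, independent of X1
x2 = s * x1                       # X2 = S*X1 is also N(0, 1), but not independent of X1

total = x1 + x2                   # equals (1 + S)*X1, which is 0 whenever S = -1
print((total == 0.0).mean())      # ~ 0.5: an atom at 0, impossible for a normal law
```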

Solution 4:

There are many proofs by mathematical induction that take the following form:

  • The case $n=1$ is trivially true.
  • Case $n+1$ follows easily from case $n$ when $n\ge 2$, with an induction step that uses both case $n$ and case $2$; that step cannot get you from case $1$ to case $2$.
  • The substantial part of the proof is case $2$.

This is one of those. Prove it for a sum of just two random variables and the rest is easy; the induction step is sketched just below.
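(Concretely, granting the two-variable case proved below and writing $S_n = X_1+\cdots+X_n$: if case $n$ gives $S_n \sim N\left(\sum_{i=1}^{n}\mu_i,\ \sum_{i=1}^{n}\sigma_i^2\right)$, then, since $S_n$ is a function of $X_1,\ldots,X_n$ only and is therefore independent of $X_{n+1}$, case $2$ applied to $S_{n+1} = S_n + X_{n+1}$ yields $$ S_{n+1} \sim N\left(\sum_{i=1}^{n+1}\mu_i,\ \sum_{i=1}^{n+1}\sigma_i^2\right), $$ completing the induction.)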

Suppose $X_i\sim N(\mu_i,\sigma_i^2)$ for $i=1,2$, and these are independent.

If you know that the density of $X_1+X_2$ is the convolution of the two separate densities (this is where independence is used), then just evaluate the integral: \begin{align} & f_{X_1+X_2}(x) \\[10pt] & =\int_{-\infty}^\infty f_{X_1}(w)f_{X_2}(x-w)\, dw \\[10pt] & = \text{constant}\cdot \int_{-\infty}^\infty \exp\left(-\frac12\left(\frac{w-\mu_1}{\sigma_1}\right)^2\right)\exp\left(-\frac12\left(\frac{x-w-\mu_2}{\sigma_2}\right)^2\right) \, dw \end{align}

The product of the two "exp"s is the "exp" of \begin{align} & -\frac12\cdot\frac{(\sigma_1^2+\sigma_2^2)w^2 - 2w(\sigma_2^2\mu_1+\sigma_1^2(x-\mu_2))+ \sigma_2^2\mu_1^2+\sigma_1^2(x-\mu_2)^2}{\sigma_1^2\sigma_2^2} \\[12pt] & = -\frac12 \cdot \frac{w^2 - 2w\cdot(\text{a certain weighted average of $\mu_1$ and $x-\mu_2$}) + \cdots}{\frac{\sigma_1^2\sigma_2^2}{\sigma_1^2+\sigma_2^2}} \end{align}

At this point the algebra gets moderately messy, but you complete the square and get $$ \text{constant}\cdot\int_{-\infty}^\infty \exp(\text{expression}_1)\cdot\exp(\text{expression}_2) \, dw $$

Now $\text{expression}_2$ should not depend on $w$, so you can pull a factor out: $$ \text{constant}\cdot\exp(\text{expression}_2)\cdot\int_{-\infty}^\infty \exp(\text{expression}_1)\,dw. $$ Then $\text{expression}_1$ will be $\text{constant}\cdot(w-\bullet)^2$, where "$\bullet$" is something that depends on $x$, so it may look as if the value of this integral depends on $x$. But it doesn't! Just let $u=w-\bullet$ so that $du=dw$, and "$\bullet$" is gone. Thus as a function of $x$, this integral is a constant. The whole thing reduces to a constant times $\exp(\text{expression}_2)$. And $\text{expression}_2$ ends up being a quadratic polynomial in $x$, with a negative leading coefficient, so we've got a normal (or "Gaussian") density function.
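(For the record, carrying the completion of the square through in the setup above gives

$$ \text{expression}_1 = -\frac{\sigma_1^2+\sigma_2^2}{2\sigma_1^2\sigma_2^2}\left(w-\frac{\sigma_2^2\mu_1+\sigma_1^2(x-\mu_2)}{\sigma_1^2+\sigma_2^2}\right)^2, \qquad \text{expression}_2 = -\frac{\big(x-(\mu_1+\mu_2)\big)^2}{2(\sigma_1^2+\sigma_2^2)}, $$

so $\text{expression}_2$ is the exponent of the $N(\mu_1+\mu_2,\ \sigma_1^2+\sigma_2^2)$ density, as claimed.)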

Solution 5:

For a geometric proof, see Bennett Eisenberg and Rosemary Sullivan, "Why Is the Sum of Independent Normal Random Variables Normal?", Mathematics Magazine, December 2008, pp. 362–366, available at http://www.maa.org/programs/faculty-and-departments/classroom-capsules-and-notes/why-is-the-sum-of-independent-normal-random-variables-normal