Scaling the normal distribution?

I might just be slow (or too drunk), but I'm seeing a conflict in the equations for adding two normals and scaling a normal. According to page 2 of this, if $X_1 \sim N(\mu_1,\sigma_1^2)$ and $X_2 \sim N(\mu_2,\sigma_2^2)$, then $X_1 + X_2 \sim N(\mu_1 + \mu_2, \sigma_1^2 + \sigma_2^2)$, and for some $c \in \mathbb{R}$, $cX_1 \sim N(c\mu_1, c^2\sigma_1^2)$.

Then for $X \sim N(\mu,\sigma^2)$, we have $X + X \sim N(\mu + \mu,\sigma^2 + \sigma^2) = N(2\mu,2\sigma^2)$, but also $X + X = 2X \sim N(2\mu,2^2\sigma^2) = N(2\mu,4\sigma^2)$? I.e., the variances disagree.

edit: Oh, am I mistaken in saying that $2X = X + X$? Is the former "rolling" $X$ just once and doubling it while the latter "rolls" twice and adds them?


On the first page of the cited document, $X_1$ and $X_2$ were previously defined to be two (distinct) independent, identically distributed random variables. For your purposes, the "identically distributed" part is not important, but the "independent" part is.

On the second page, where $X_1$ and $X_2$ are considered to be normal variables, there's still the assumption that they're independent. Possibly this could have been stated more clearly, but in context this assumption makes sense.

When you consider $2X = X + X$, you are not dealing with two independent variables. The two "copies" of $X$ are correlated (in fact, as correlated as any two variables can be). The formula for the sum of two independent normal variables therefore does not apply.
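A quick simulation makes the difference visible. This is only a sketch using Python's standard library; the function name `simulate_sum_vs_scale` and the parameter values are made up for illustration. It draws $X$ once and doubles it, and separately draws two independent copies and adds them:

```python
import random
import statistics

def simulate_sum_vs_scale(mu=1.0, sigma=2.0, n=50_000, seed=0):
    """Compare Var(2X) with Var(X1 + X2) for independent X1, X2 ~ N(mu, sigma^2)."""
    rng = random.Random(seed)
    doubled = []  # one draw, doubled: 2X
    summed = []   # two independent draws, added: X1 + X2
    for _ in range(n):
        x = rng.gauss(mu, sigma)
        doubled.append(2 * x)
        x1 = rng.gauss(mu, sigma)
        x2 = rng.gauss(mu, sigma)
        summed.append(x1 + x2)
    return statistics.pvariance(doubled), statistics.pvariance(summed)

var_doubled, var_summed = simulate_sum_vs_scale()
print("Var(2X)      ~", var_doubled)  # should be close to 4 * sigma^2
print("Var(X1 + X2) ~", var_summed)   # should be close to 2 * sigma^2
```

With $\sigma = 2$, the first estimate should land near $4\sigma^2 = 16$ and the second near $2\sigma^2 = 8$, up to sampling noise.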


Expectation is always linear. So for any two variables $X,Y$, we have $E[X+Y]=E[X]+E[Y]$. And $E[\underbrace{X+X+\ldots+X}_{k \text{ times}}]=E[kX]=kE[X]$.

Variance is additive when the variables are independent: in this case, $V[X+Y]=V[X]+V[Y]$. However, when the variables are the same, i.e. when we scale, we have $V[kX]=k^2 V[X]$.

These are true no matter what the distribution. Determining the distribution of the sum of random variables is, in general, difficult. However when $X$ and $Y$ are independent normal variables, then the sum of $X$ and $Y$ is also normally distributed (and the means and variances add as above).

In my opinion your question is a good one and it is very easy to become confused about what is adding and what is scaling. A very important example which involves adding independent distributions and scaling is when you compute the variance of the mean $\bar{X}$ of independent identically distributed (iid) samples from the same distribution.

To keep things simple, suppose we have $n$ samples, each normally distributed with variance $\sigma^2$, so $X_i \sim N(\mu,\sigma^2)$ for each $i$. Since the variances of independent random variables add, the variance of the sum $S = \sum_{i=1}^n X_i$ is $V(S) = V\left(\sum_{i=1}^n X_i\right) = \sum_{i=1}^n V\left(X_i\right) = n\sigma^2$. But the mean is $\bar{X}=S/n$, and so by scaling $V(\bar{X}) = V(S/n) = n\sigma^2 / n^2 = \sigma^2/n$. It is this combination of adding and scaling which leads to the famous relationship that the standard deviation of the sum grows as $\sqrt{n}$, while that of the mean shrinks as $1/\sqrt{n}$.
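The add-then-scale argument can be checked numerically. The following is a rough sketch with the standard library; `sum_and_mean_variances` and its default values ($n = 25$, $\sigma = 2$) are arbitrary choices for illustration, not anything from the cited document:

```python
import random
import statistics

def sum_and_mean_variances(n=25, mu=0.0, sigma=2.0, trials=20_000, seed=0):
    """Estimate V(S) and V(S/n) where S is the sum of n iid N(mu, sigma^2) draws."""
    rng = random.Random(seed)
    sums, means = [], []
    for _ in range(trials):
        draws = [rng.gauss(mu, sigma) for _ in range(n)]  # n independent draws
        s = sum(draws)
        sums.append(s)       # adding independent variables
        means.append(s / n)  # scaling the sum by 1/n
    return statistics.pvariance(sums), statistics.pvariance(means)

var_sum, var_mean = sum_and_mean_variances()
print("V(S)    ~", var_sum)   # should be close to n * sigma^2
print("V(Xbar) ~", var_mean)  # should be close to sigma^2 / n
```

With these defaults, the estimates should come out near $n\sigma^2 = 100$ and $\sigma^2/n = 0.16$ respectively, up to sampling noise.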


Suppose $X_1, X_2, \ldots, X_k$ are any $k$ random variables (not necessarily independent).

Then the variance of $X_1+X_2+\ldots+X_k$ is given by:

$Var(X_1+X_2+\ldots+X_k)=\sum_{i=1}^k Var(X_i)+2\sum_{1\le i<j\le k}Cov(X_i,X_j)$

When $X_1, X_2, \ldots, X_k$ are independent, $Cov(X_i, X_j)=0$ for all $i \ne j$.

$\therefore$ $Var(X_1+X_2+\ldots+X_k)=\sum_{i=1}^k Var(X_i)$

When the summands are $k$ copies of the same variable, $X, X, \ldots, X$ (i.e. the sum is $kX$), the copies are no longer independent, so $Cov(X,X)$ is not zero.

For example, if you want to calculate $Var(X+X+X)$, it is not $3\,Var(X)$; it is $9\,Var(X)$.

This is because $Var(X+X+X)=\sum_{i=1}^3 Var(X)+2\sum_{1 \le i < j \le 3} Cov(X,X) =3\,Var(X)+6\,Cov(X,X)$ and $Cov(X,X)=Var(X)$.

Therefore, $Var(X+X+X)=3\,Var(X)+6\,Var(X)=9\,Var(X)$.
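For anyone who wants to see the covariance terms at work, here is a small sketch that evaluates the general variance-of-a-sum formula on sample data. The helper names `covariance` and `variance_of_sum` are my own, and the sample size and standard deviation are arbitrary:

```python
import random

def covariance(xs, ys):
    """Population covariance of two equal-length samples; covariance(xs, xs) is the variance."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n

def variance_of_sum(columns):
    """Sum of Var(X_i) plus twice the sum of Cov(X_i, X_j) over all pairs i < j."""
    k = len(columns)
    variances = sum(covariance(c, c) for c in columns)
    cross = sum(covariance(columns[i], columns[j])
                for i in range(k) for j in range(i + 1, k))
    return variances + 2 * cross

rng = random.Random(1)
size = 50_000
x = [rng.gauss(0.0, 1.5) for _ in range(size)]
var_x = covariance(x, x)

# Three copies of the same sample: every cross term equals Var(X).
three_copies = variance_of_sum([x, x, x])

# Three independently drawn samples: cross terms are close to zero.
independent = [[rng.gauss(0.0, 1.5) for _ in range(size)] for _ in range(3)]
three_independent = variance_of_sum(independent)

print("Var(X+X+X) / Var(X)    ~", three_copies / var_x)       # should be 9
print("Var(X1+X2+X3) / Var(X) ~", three_independent / var_x)  # should be near 3
```

The first ratio is 9 up to floating-point rounding because the three cross terms each contribute $2\,Var(X)$; the second is only approximately 3 because the sample covariances of independent draws are small but not exactly zero.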

The key is "When you consider 2X=X+X, you are not dealing with two independent variables" as David K points out.