Determining the variance of the sum of two correlated random variables

I understand that the variance of the sum of two independent normally distributed random variables is the sum of the variances, but how does this change when the two random variables are correlated?


For any two random variables: $$\text{Var}(X+Y) =\text{Var}(X)+\text{Var}(Y)+2\text{Cov}(X,Y).$$ If the variables are uncorrelated (that is, $\text{Cov}(X,Y)=0$), then

$$\tag{1}\text{Var}(X+Y) =\text{Var}(X)+\text{Var}(Y).$$ In particular, if $X$ and $Y$ are independent, then equation $(1)$ holds.

In general $$ \text{Var}\Bigl(\,\sum_{i=1}^n X_i\,\Bigr)= \sum_{i=1}^n\text{Var}( X_i)+ 2\sum_{i< j} \text{Cov}(X_i,X_j). $$ If for each $i\ne j$, $X_i$ and $X_j$ are uncorrelated, in particular if the $X_i$ are pairwise independent (that is, $X_i$ and $X_j$ are independent whenever $i\ne j$), then $$ \text{Var}\Bigl(\,\sum_{i=1}^n X_i\,\Bigr)= \sum_{i=1}^n\text{Var}( X_i) . $$
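A quick numerical sanity check of the two-variable identity, with hypothetical helper functions `pvar` and `pcov` (population moments, dividing by $N$) on a small made-up data set:

```python
# Check Var(X+Y) = Var(X) + Var(Y) + 2 Cov(X,Y) on fixed data,
# using population (divide-by-N) variance and covariance.

def pvar(v):
    m = sum(v) / len(v)
    return sum((a - m) ** 2 for a in v) / len(v)

def pcov(u, v):
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / len(u)

x = [1.0, 2.0, 4.0, 7.0]
y = [2.0, 1.0, 5.0, 6.0]          # deliberately correlated with x
s = [a + b for a, b in zip(x, y)]  # the sum X + Y, sample by sample

lhs = pvar(s)
rhs = pvar(x) + pvar(y) + 2 * pcov(x, y)
print(abs(lhs - rhs) < 1e-12)  # → True
```

The identity is algebraic, so it holds exactly for any data set, not just in expectation.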


You can also think of this in vector form:

$$\text{Var}(a^T X) = a^T \text{Var}(X) a$$

where $a$ is a fixed vector (or matrix) of coefficients and $X = (X_1, X_2, \dots, X_n)^T$ is a vector of random variables; $\text{Var}(X)$ denotes its covariance matrix.

If $a = (1, 1, \dots, 1)^T$, then $a^T X$ is the sum of all the $X_i$'s.
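The quadratic form $a^T \text{Var}(X)\, a$ can be checked directly against the variance of the sum. A sketch, with a hypothetical `pcov` helper for the population covariance and a small made-up three-variable data set:

```python
# Check Var(a^T X) = a^T Sigma a, where Sigma is the covariance matrix.

def pcov(u, v):
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    return sum((p - mu) * (q - mv) for p, q in zip(u, v)) / len(u)

data = [  # rows are variables X_1, X_2, X_3; columns are observations
    [1.0, 2.0, 4.0, 7.0],
    [2.0, 1.0, 5.0, 6.0],
    [0.0, 3.0, 3.0, 8.0],
]
n = len(data)
Sigma = [[pcov(data[i], data[j]) for j in range(n)] for i in range(n)]

a = [1.0, 1.0, 1.0]  # a = (1, ..., 1)^T gives the plain sum
quad = sum(a[i] * Sigma[i][j] * a[j] for i in range(n) for j in range(n))

s = [sum(col) for col in zip(*data)]  # a^T X, observation by observation
direct = pcov(s, s)                   # Var of the sum, computed directly

print(abs(quad - direct) < 1e-12)  # → True
```

The diagonal terms of the quadratic form contribute the $\text{Var}(X_i)$, and each off-diagonal pair $(i, j)$ and $(j, i)$ together contribute the $2\,\text{Cov}(X_i, X_j)$ from the scalar formula.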


Let's work this out from the definitions. Say we have two random variables $x$ and $y$, observed as $N$ paired samples $x_i$ and $y_i$, with means $\mu_x$ and $\mu_y$. The (population) variances of $x$ and $y$ are:

$${\sigma_x}^2 = \frac{\sum_i(\mu_x-x_i)^2}{N}$$ $${\sigma_y}^2 = \frac{\sum_i(\mu_y-y_i)^2}{N}$$

The covariance of $x$ and $y$ is:

$${\sigma_{xy}} = \frac{\sum_i(\mu_x-x_i)(\mu_y-y_i)}{N}$$

Now, let us consider the weighted sum $p$ of $x$ and $y$, with samples $p_i = w_xx_i + w_yy_i$ and mean

$$\mu_p = w_x\mu_x + w_y\mu_y$$

$${\sigma_p}^2 = \frac{\sum_i(\mu_p-p_i)^2}{N} = \frac{\sum_i(w_x\mu_x + w_y\mu_y - w_xx_i - w_yy_i)^2}{N} = \frac{\sum_i(w_x(\mu_x - x_i) + w_y(\mu_y - y_i))^2}{N} = \frac{\sum_i(w^2_x(\mu_x - x_i)^2 + w^2_y(\mu_y - y_i)^2 + 2w_xw_y(\mu_x - x_i)(\mu_y - y_i))}{N} \\ = w^2_x\frac{\sum_i(\mu_x-x_i)^2}{N} + w^2_y\frac{\sum_i(\mu_y-y_i)^2}{N} + 2w_xw_y\frac{\sum_i(\mu_x-x_i)(\mu_y-y_i)}{N} \\ = w^2_x\sigma^2_x + w^2_y\sigma^2_y + 2w_xw_y\sigma_{xy}$$
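The final line of the derivation can be checked numerically. A sketch with arbitrary weights and a hypothetical `pcov` helper for population moments (so `pcov(x, x)` is $\sigma_x^2$):

```python
# Check sigma_p^2 = w_x^2 sigma_x^2 + w_y^2 sigma_y^2 + 2 w_x w_y sigma_xy
# with population (divide-by-N) moments, as in the derivation above.

def pcov(u, v):
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / len(u)

x = [1.0, 2.0, 4.0, 7.0]
y = [2.0, 1.0, 5.0, 6.0]
wx, wy = 0.3, 0.7                 # arbitrary illustrative weights

p = [wx * a + wy * b for a, b in zip(x, y)]
lhs = pcov(p, p)                  # sigma_p^2 computed directly from p
rhs = wx**2 * pcov(x, x) + wy**2 * pcov(y, y) + 2 * wx * wy * pcov(x, y)
print(abs(lhs - rhs) < 1e-12)  # → True
```

Setting $w_x = w_y = 1$ recovers the plain-sum formula from the first answer.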


Consider a function of two variables, $ z = f(x, y) $. Then the variation of z, $\delta z$, is $$\tag{1} \delta z = \frac{df}{dx} \ \delta x $$ where $$ \frac{df}{dx} = \frac{\partial f}{\partial x} + \frac{\partial f}{\partial y} \frac{ dy}{dx}. $$ Squaring equation (1) we get $$ (\delta z)^2 = \Big[ \left( \frac{\partial f}{\partial x} \right)^2 + 2 \frac{\partial f}{\partial x} \frac{\partial f}{\partial y} \frac{dy}{dx} + \left( \frac{\partial f}{\partial y}\right)^2 \left( \frac{dy}{dx} \right)^2 \Big] (\delta x)^2. $$ Multiplying this out we get $$ (\delta z)^2 = \left( \frac{\partial f}{\partial x} \right)^2 (\delta x)^2+ 2 \frac{\partial f}{\partial x} \frac{\partial f}{\partial y} \delta x \delta y + \left( \frac{\partial f}{\partial y}\right)^2 (\delta y)^2, $$ where we have used that $\delta y = \frac{dy}{dx} \delta x$. Now we can identify the quadratic variation terms with the variances and covariance of random variables: $$ \text{Var}(z) = \left( \frac{\partial f}{\partial x} \right)^2 \text{Var}(x) + 2 \frac{\partial f}{\partial x} \frac{\partial f}{\partial y} \text{Cov}(x,y) + \left( \frac{\partial f}{\partial y}\right)^2 \text{Var}(y). $$ When the function $f$ is just a sum of $x$ and $y$ then the partial derivative terms are all equal to one, giving $$\text{Var}(z) = \text{Var}(x) + 2\ \text{Cov}(x,y) + \text{Var}(y). $$
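For a nonlinear $f$ this propagation formula is only a first-order approximation, valid when the fluctuations are small relative to the means. A Monte Carlo sketch for the made-up example $f(x, y) = xy$, where the partials at the means are $f_x = \mu_y$ and $f_y = \mu_x$ (the `pcov` helper and all numbers are illustrative assumptions):

```python
# Monte Carlo check of the first-order propagation formula for f(x, y) = x*y,
# with small correlated fluctuations around the means (10, 5).
import random

random.seed(1)
mx, my = 10.0, 5.0
N = 200000
xs, ys, zs = [], [], []
for _ in range(N):
    u = random.gauss(0, 0.01)
    v = 0.5 * u + random.gauss(0, 0.01)   # correlate the fluctuations
    xs.append(mx + u)
    ys.append(my + v)
    zs.append((mx + u) * (my + v))

def pcov(u, v):
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    return sum((p - mu) * (q - mv) for p, q in zip(u, v)) / len(u)

# Partial derivatives evaluated at the means: f_x = my, f_y = mx.
approx = my**2 * pcov(xs, xs) + 2 * my * mx * pcov(xs, ys) + mx**2 * pcov(ys, ys)
rel_err = abs(pcov(zs, zs) - approx) / pcov(zs, zs)
print(rel_err < 0.05)
```

With fluctuations this small the higher-order terms dropped by the linearization are negligible, so the observed variance of $z$ agrees with the formula to within Monte Carlo noise.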