How can I simply prove that the Pearson correlation coefficient is between -1 and 1?

Solution 1:

First of all, Pearson's correlation coefficient is bounded between -1 and 1, not 0 and 1. Its absolute value is bounded between 0 and 1, and that is useful later.

Pearson's correlation coefficient is simply this ratio:

$$\rho = \frac{Cov(X,Y)}{\sqrt{Var(X)Var(Y)}}$$

Both variances are non-negative by definition, so the denominator is $\ge 0$. The ratio is undefined only when one of the variables has zero variance.
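
As a rough illustration of the ratio, here is a minimal Python sketch that computes the sample version of $\rho$ directly from the definition. The function name `pearson` and the choice to divide by $n$ (rather than $n-1$) are my own assumptions; the factor cancels in the ratio either way.

```python
import math

def pearson(xs, ys):
    """Sample Pearson correlation, computed as Cov / sqrt(Var * Var)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Divide-by-n moments; the 1/n factors cancel in the final ratio.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    var_x = sum((x - mean_x) ** 2 for x in xs) / n
    var_y = sum((y - mean_y) ** 2 for y in ys) / n
    if var_x == 0 or var_y == 0:
        # The singular case: a constant variable makes the ratio 0/0.
        raise ZeroDivisionError("correlation is undefined when a variance is 0")
    return cov / math.sqrt(var_x * var_y)
```

For example, `pearson([1, 2, 3, 4], [2, 4, 6, 8])` sits at the upper end of the range and `pearson([1, 2, 3, 4], [8, 6, 4, 2])` at the lower end, while a constant input such as `[1, 1, 1]` hits the zero-variance case.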

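If two random variables are independent, then their covariance is 0, i.e. they are uncorrelated. (The converse does not hold in general: uncorrelated variables need not be independent.) So the lower bound of 0 for the absolute value of the expression is actually attained.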

This can be shown like so:

$$Cov(X,Y) = E[(X-E[X])(Y-E[Y])] = E[XY] - E[X]E[Y]$$

If two random variables are independent, then $E[XY]=E[X]E[Y]$, and

$$Cov(X,Y) = E[XY] - E[X]E[Y] = E[X]E[Y] - E[X]E[Y] = 0.$$
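
The identity $Cov(X,Y) = E[XY] - E[X]E[Y]$ can be checked exactly on small discrete distributions. The sketch below is only an illustration: the helper names `expect` and `cov` and both example distributions are my own choices. The first pair is independent; the second is clearly dependent but still has zero covariance, so a covariance of 0 on its own does not tell you the variables are independent.

```python
from fractions import Fraction
from itertools import product

def expect(pmf, f):
    """E[f(X, Y)] for a joint pmf given as {(x, y): probability}."""
    return sum(p * f(x, y) for (x, y), p in pmf.items())

def cov(pmf):
    """Cov(X, Y) = E[XY] - E[X]E[Y], computed with exact fractions."""
    e_xy = expect(pmf, lambda x, y: x * y)
    e_x = expect(pmf, lambda x, y: x)
    e_y = expect(pmf, lambda x, y: y)
    return e_xy - e_x * e_y

# Two independent fair dice: the joint pmf is the product of the marginals.
dice = {(x, y): Fraction(1, 36) for x, y in product(range(1, 7), repeat=2)}

# A dependent pair: X uniform on {-1, 0, 1} and Y = X**2.
dependent = {(x, x * x): Fraction(1, 3) for x in (-1, 0, 1)}

print(cov(dice))       # 0
print(cov(dependent))  # 0
```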

Now for the upper bound. Here we apply the Cauchy-Schwarz inequality, $(E[UV])^2 \le E[U^2]E[V^2]$, to the centered variables $U = X - E[X]$ and $V = Y - E[Y]$:

$$|Cov(X,Y)|^2 \le Var(X)Var(Y)$$

$$\therefore |Cov(X,Y)| \le \sqrt{Var(X)Var(Y)}$$

Plug this result from the Cauchy-Schwarz inequality into the formula for $\rho$, and we get:

$$|\rho| = \left|\frac{Cov(X,Y)}{\sqrt{Var(X)Var(Y)}}\right| \le \frac{\sqrt{Var(X)Var(Y)}}{\sqrt{Var(X)Var(Y)}} = 1$$
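
A numerical sanity check is no substitute for the proof, but it can make the inequality more concrete. The sketch below draws random samples and checks $Cov^2 \le Var(X)Var(Y)$ on each one. The function names, the sampling distributions, and the small floating-point tolerance are all assumptions of mine.

```python
import math
import random

def sample_moments(xs, ys):
    """Return (cov, var_x, var_y) using divide-by-n sample moments."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    var_x = sum((x - mean_x) ** 2 for x in xs) / n
    var_y = sum((y - mean_y) ** 2 for y in ys) / n
    return cov, var_x, var_y

def largest_abs_correlation(trials=1000, seed=0):
    """Check Cauchy-Schwarz on random samples; return the largest |rho| seen."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(trials):
        n = rng.randint(2, 20)
        xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
        ys = [rng.uniform(-5.0, 5.0) for _ in range(n)]
        cov, var_x, var_y = sample_moments(xs, ys)
        # Cauchy-Schwarz, with a small relative tolerance for rounding error.
        assert cov * cov <= var_x * var_y * (1 + 1e-9)
        worst = max(worst, abs(cov) / math.sqrt(var_x * var_y))
    return worst
```

Samples of size 2 are always perfectly collinear, so they land on $|\rho| = 1$ up to rounding, which is why the check uses a tolerance rather than a strict comparison.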

Thus the absolute value of the correlation is bounded below by 0 and above by 1, which is the same as saying $-1 \le \rho \le 1$.

Solution 2:

Saw this while looking around. To complete the proof that uses the variance of $\frac{X}{\sigma_x}\pm\frac{Y}{\sigma_y}$, expand that variance:

$$0\leq Var\left(\frac{X}{\sigma_x}\pm\frac{Y}{\sigma_y}\right) = Var\left(\frac{X}{\sigma_x}\right) + Var\left(\frac{Y}{\sigma_y}\right) \pm 2Cov\left(\frac{X}{\sigma_x},\frac{Y}{\sigma_y}\right)$$

where, due to $Var(aX)=a^2Var(X)$ (and likewise for $Y$),

$$Var\left(\frac{X}{\sigma_x}\right) = \frac{1}{\sigma_x^2}Var(X)=\frac{\sigma_x^2}{\sigma_x^2}=1$$

and, due to $Cov(aX,bY)=abCov(X,Y)$,

$$Cov\left(\frac{X}{\sigma_x},\frac{Y}{\sigma_y}\right) = \frac{1}{\sigma_x\sigma_y}Cov(X,Y) = Corr(X,Y)$$

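Hence

$$Var\left(\frac{X}{\sigma_x}\pm\frac{Y}{\sigma_y}\right) = 1+1\pm2Corr(X,Y)$$

Since a variance is non-negative, $2 \pm 2Corr(X,Y) \ge 0$, which gives $-1 \le Corr(X,Y) \le 1$.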
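
The same identity holds for sample moments, as long as the variance, covariance and standard deviations all use the same normalisation. Here is a minimal sketch that checks it on simulated data; the helper names and the way the data is generated are my own assumptions, chosen only to get a clearly correlated pair.

```python
import math
import random

def variance(values):
    """Divide-by-n sample variance."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / n

def correlation(xs, ys):
    """Divide-by-n sample correlation."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    return cov / math.sqrt(variance(xs) * variance(ys))

rng = random.Random(1)
xs = [rng.gauss(0.0, 2.0) for _ in range(200)]
ys = [0.5 * x + rng.gauss(0.0, 1.0) for x in xs]

sd_x = math.sqrt(variance(xs))
sd_y = math.sqrt(variance(ys))
rho = correlation(xs, ys)

# Variance of the standardised sum and difference.
var_sum = variance([x / sd_x + y / sd_y for x, y in zip(xs, ys)])
var_diff = variance([x / sd_x - y / sd_y for x, y in zip(xs, ys)])

# Both should match 2 +/- 2*rho up to floating-point rounding.
assert math.isclose(var_sum, 2 + 2 * rho, rel_tol=1e-9, abs_tol=1e-9)
assert math.isclose(var_diff, 2 - 2 * rho, rel_tol=1e-9, abs_tol=1e-9)
```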

Solution 3:

Here's a standalone proof that should be much easier to understand and justify than quoting Cauchy-Schwarz. It's taken from Section 7.4 of Sheldon Ross's A First Course in Probability (10th edition), which I highly recommend.

Let $X$ and $Y$ be random variables with respective variances $\mathrm{Var}(X) = \sigma_x^2$ and $\mathrm{Var}(Y) = \sigma_y^2$, both nonzero. We then have

$$ 0 \leq \mathrm{Var} \left( \frac{X}{\sigma_x} \pm \frac{Y}{\sigma_y} \right) = 2 \pm 2 \mathrm{Corr} (X,Y). $$

Taking the $+$ sign gives $\mathrm{Corr} (X,Y) \geq -1$ and taking the $-$ sign gives $\mathrm{Corr} (X,Y) \leq 1$, so $-1 \leq \mathrm{Corr} (X,Y) \leq 1$.