Intuitive meaning of Pearson Product-moment correlation coefficient Formula

We always have $|\rho| \le 1$. If the joint distribution of $X$ and $Y$ is concentrated on a straight line with positive slope, i.e. $Y = a X + b$ for some constants $a > 0$ and $b$, then $\rho = 1$; if it is concentrated on a straight line with negative slope, then $\rho = -1$; if $X$ and $Y$ are independent, then $\rho = 0$ (the converse does not hold in general). In other cases $\rho$ tells you, in a sense, how close the distribution is to these cases.

Note also that $\rho$ is unaffected by changes of units (scaling), i.e. multiplying $X$ or $Y$ by positive constants, and by translation, i.e. adding constants to $X$ or $Y$.
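A quick numerical check of these properties, as a minimal sketch: the helper `pearson` and the sample data are made up for illustration, and only the Python standard library is used.

```python
import math

def pearson(xs, ys):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Illustrative data (hypothetical, not from any real data set).
x = [1.0, 2.0, 4.0, 7.0, 11.0]
y = [2.0, 1.5, 5.0, 6.0, 9.5]

r = pearson(x, y)

# Points exactly on a line: positive slope, then negative slope.
r_pos_line = pearson(x, [3.0 * v + 2.0 for v in x])
r_neg_line = pearson(x, [-0.5 * v + 1.0 for v in x])

# Change of units: positive scaling plus translation on both variables.
r_rescaled = pearson([100.0 * v + 32.0 for v in x],
                     [0.01 * v - 5.0 for v in y])
```

Up to floating-point rounding, `r_pos_line` and `r_neg_line` should come out as $1$ and $-1$, and `r_rescaled` should agree with `r`.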


First, the denominator can be removed if the values in $X$ and $Y$ are standardized: call them $Z_x = (X - \bar X)/S_x$ and $Z_y = (Y - \bar Y)/S_y$, so that $Z_x$ and $Z_y$ have mean $0$ and standard deviations $S_{Z_x}=1$ and $S_{Z_y}=1$.

Then the correlation is simply the sum of the products of the individual values divided by $n$, $$ \rho = \frac{1}{n} \sum_{k=1}^n Z_{x,k} Z_{y,k} $$ (with $S_x$ and $S_y$ computed using the divisor $n$), i.e. the average of something like the common excess from the mean, where we understand the $Z_{x,k}$ and $Z_{y,k}$ as such excesses.
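To see the "average product of standardized values" reading in action, here is a small sketch. It assumes the standardized values are centered (mean subtracted) and scaled by the population standard deviation (divisor $n$); the helper name `standardize` and the data are made up for illustration.

```python
import math

def standardize(values):
    """Return z-scores: subtract the mean, divide by the population SD (divisor n)."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return [(v - mean) / sd for v in values]

# Illustrative data (hypothetical).
x = [1.0, 2.0, 4.0, 7.0, 11.0]
y = [2.0, 1.5, 5.0, 6.0, 9.5]
n = len(x)

zx = standardize(x)
zy = standardize(y)

# Correlation as the average product of z-scores.
rho_from_z = sum(a * b for a, b in zip(zx, zy)) / n

# The same quantity from the covariance / (SD * SD) form, for comparison.
mx, my = sum(x) / n, sum(y) / n
cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
sd_x = math.sqrt(sum((a - mx) ** 2 for a in x) / n)
sd_y = math.sqrt(sum((b - my) ** 2 for b in y) / n)
rho_direct = cov / (sd_x * sd_y)
```

The two values `rho_from_z` and `rho_direct` should agree up to rounding. A product $Z_{x,k} Z_{y,k}$ is positive when both values lie on the same side of their means and negative otherwise, which is why the average measures "common excess".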

I like the model of $\rho$ as the cosine of the angle between an $X$-vector and a $Y$-vector in $n$-dimensional Euclidean space: each vector has its tail at the origin and its head at the point whose coordinates are the $Z_x$-values, respectively the $Z_y$-values, and each observation (the $k$'th case) defines its own dimension/axis. (That also indicates why the individual observations/measurements should be (conceptually) independent of each other, so that the axes in that $n$-dimensional space are orthogonal to each other.) It is then also immediately obvious that the two vectors can be rotated together (as a rigid wire model) in this space so that only two dimensions are needed, because two vectors from the origin span only a plane. (The generalization to more vectors is immediate.)
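The cosine picture can be checked numerically. This is a sketch with made-up data and hypothetical helper names (`center`, `cosine`); since the cosine of an angle does not depend on the lengths of the vectors, centering alone is enough here and the division by the standard deviations can be skipped.

```python
import math

def center(values):
    """Subtract the mean so the vector is measured from the 'origin at zero'."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]

def cosine(u, v):
    """Cosine of the angle between two vectors, clamped to [-1, 1] against rounding."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    c = dot / (norm_u * norm_v)
    return max(-1.0, min(1.0, c))

# Four observations, so each variable is one vector in 4-dimensional space.
x = [3.0, 5.0, 6.0, 10.0]
y = [1.0, 4.0, 2.0, 7.0]

cos_xy = cosine(center(x), center(y))         # equals the correlation of x and y
angle_deg = math.degrees(math.acos(cos_xy))   # the angle between the two vectors

# Perfect positive / negative linear relation: angle 0 resp. 180 degrees.
cos_same = cosine(center(x), center([2.0 * v + 1.0 for v in x]))
cos_opposite = cosine(center(x), center([-v for v in x]))
```

In this reading, $\rho = 1$ means the two vectors point in the same direction, $\rho = -1$ that they point in opposite directions, and $\rho = 0$ that they are perpendicular.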