What does the value of a probability density function (PDF) at some x indicate?

I understand that the probability mass function of a discrete random variable $X$ is $y=g(x)$. This means $P(X=x_0) = g(x_0)$.

Now, a probability density function of a continuous random variable $X$ is $y=f(x)$. Wikipedia defines this function $y$ to mean

In probability theory, a probability density function (pdf), or density of a continuous random variable, is a function that describes the relative likelihood for this random variable to take on a given value.

I am confused about the meaning of 'relative likelihood' because it certainly does not mean probability! The probability $P(X<x_0)$ is given by some integral of the pdf.

So what does $f(x_0)$ indicate? It gives a real number, but isn't the probability of a continuous random variable taking any specific value always zero?


Solution 1:

'Relative likelihood' is indeed misleading. Look at it as a limit instead: $$ f(x)=\lim_{h \to 0}\frac{F(x+h)-F(x)}{h}, $$ where $F(x) = P(X \leq x)$ is the cumulative distribution function.
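This limit can be checked numerically. Here is a minimal sketch assuming $X$ is a standard normal (an illustrative choice, not from the original answer): the finite difference of the CDF at a small $h$ essentially reproduces the pdf.

```python
# Sketch: the pdf as the derivative of the CDF, for a standard normal X.
# Uses only the standard library (math.erf gives the normal CDF exactly).
import math

def F(x):
    # Standard normal CDF via the error function.
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

def f(x):
    # Standard normal pdf, for comparison.
    return math.exp(-x**2 / 2) / math.sqrt(2 * math.pi)

x, h = 1.0, 1e-6
finite_diff = (F(x + h) - F(x)) / h  # (F(x+h) - F(x)) / h for small h
print(finite_diff, f(x))  # the two values agree to several decimal places
```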

Solution 2:

I am not sure if Jester is still interested, as it's been 5 years, but I think I found a less confusing answer than Wikipedia's.

In contrast to the discrete case, if $X$ is continuous, the pdf $f$ is a function whose value at any given point is not a probability; rather, it indicates how likely $X$ is to fall near that point. For example, if the pdf is large around a point $x$, then the random variable $X$ is more likely to take values close to $x$. If, on the other hand, $f(x)=0$ on some interval, then $X$ will not fall in that interval.

Of course, a more practical way of thinking about it is that the probability of $X$ being in an interval is given by the integral of the pdf over that interval.

You might want to look at the link below for more details: http://mathinsight.org/probability_density_function_idea
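That integral view can be made concrete with a short sketch. Assuming a standard normal $X$ (my choice of example, not the answer's), a plain Riemann sum of the pdf over $[-1,1]$ recovers the familiar "about 68% within one standard deviation":

```python
# Sketch: P(a < X < b) as the integral of the pdf over [a, b],
# approximated with a midpoint Riemann sum, for a standard normal X.
import math

def f(x):
    # Standard normal pdf.
    return math.exp(-x**2 / 2) / math.sqrt(2 * math.pi)

def prob(a, b, n=100_000):
    # Midpoint Riemann sum of f over [a, b] with n subintervals.
    dx = (b - a) / n
    return sum(f(a + (i + 0.5) * dx) for i in range(n)) * dx

print(prob(-1, 1))  # ≈ 0.6827
```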

Solution 3:

In general, if $X$ is a random variable taking values in a measure space $(A,\mathcal A,\mu)$ with pdf $f:A\to [0,\infty)$, then for every measurable set $S\in\mathcal A$, $$P(X\in S) = \int_S f\,d\mu. $$ So, if $A=\Bbb R$ (and $\mu=\lambda$, the Lebesgue measure), then $$P(a<X<b)=\int_a^b f(x)\,dx.$$ In particular, at (almost) every point $x$, $$f(x) = \lim_{t\to 0} \frac1{2t}\int_{x-t}^{x+t} f =\lim_{t\to 0} \frac1{2t}\, P(|X-x|<t). $$ We can call this the 'relative likelihood'.

Solution 4:

Intro statistics focuses on the PDF as the description of the population, but in fact it is the CDF (cumulative distribution function) that gives you a functional understanding of the population, since points on the CDF denote probabilities over a relevant range of the measure. If you look at all stats from this perspective, then the PDF is just the description of how probability changes with respect to a small change around a point along the measure at hand. The values of the PDF therefore only tell you about the spread. For example, given two normal distributions $N(\mu_1, \sigma_1^2)$ and $N(\mu_2, \sigma_2^2)$ with densities $f_1$ and $f_2$, if you choose any value of $x$ to get the points $p_n=\mu_n+x\,\sigma_n$ for the respective distributions and find $f_1(p_1) > f_2(p_2)$, then this just means $\sigma_1 < \sigma_2$. Similar relationships exist for other distributions.
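A quick sketch of that claim, with two arbitrarily chosen normals (the parameters below are my illustrative assumptions): at the same standardized offset $x$, the distribution with the smaller $\sigma$ has the larger pdf value.

```python
# Sketch: comparing pdf values of two normals at matching standardized
# points mu_n + x * sigma_n; the narrower one wins.
import math

def normal_pdf(x, mu, sigma):
    # Density of N(mu, sigma^2) at x.
    z = (x - mu) / sigma
    return math.exp(-z**2 / 2) / (sigma * math.sqrt(2 * math.pi))

x = 1.5                 # arbitrary standardized offset
mu1, s1 = 0.0, 1.0      # first distribution (assumed example)
mu2, s2 = 5.0, 2.0      # second distribution, with larger sigma
v1 = normal_pdf(mu1 + x * s1, mu1, s1)
v2 = normal_pdf(mu2 + x * s2, mu2, s2)
print(v1 > v2)  # True, because s1 < s2
```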

Solution 5:

For continuous probability distributions, it is not useful to talk about $P(X=x_0)$, since this probability is zero. It is only useful to talk about the probabilities of $X$ being in sets with positive length, like a nontrivial interval, or a union of several intervals. So, instead of asking the probability of $X=x_0$, we ask the probability that $X$ is near $x_0$. To be precise, we talk about the probability that $X$ is in an interval of length $\delta$ centered at $x_0$, for some $\delta>0$. That is, $$ P(x_0-\delta/2<X<x_0+\delta/2) $$ As $\delta\to 0$, the above probability will approach zero. How quickly will it approach zero? It turns out that there will exist a number $f(x_0)$ for which $$ P(x_0-\delta/2<X<x_0+\delta/2)\approx \delta f(x_0) $$ That is, the probability that $X$ lies in a small interval around $x_0$ is approximated by $f(x_0)$ times the length of that interval. Furthermore, the relative error of this approximation approaches zero as the length of the interval approaches zero.
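This shrinking relative error is easy to observe numerically. A minimal sketch, assuming a standard normal $X$ and an arbitrary point $x_0 = 0.7$ (both my choices for illustration):

```python
# Sketch: P(x0 - d/2 < X < x0 + d/2) ≈ d * f(x0), with the relative
# error shrinking as d -> 0, for a standard normal X.
import math

def F(x):
    # Standard normal CDF via the error function.
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

def f(x):
    # Standard normal pdf.
    return math.exp(-x**2 / 2) / math.sqrt(2 * math.pi)

x0 = 0.7
rel_errs = []
for delta in (0.5, 0.05, 0.005):
    p = F(x0 + delta / 2) - F(x0 - delta / 2)  # exact interval probability
    rel_errs.append(abs(p - delta * f(x0)) / p)
print(rel_errs)  # each entry smaller than the last
```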

This quantity $f(x_0)$ is the value of the pdf of $X$ at $x_0$, and the preceding discussion is a way of giving it precise meaning.