Why is the probability that a continuous random variable takes a specific value zero?

The problem begins with your use of the formula

$$ Pr(X = x) = \frac{\text{# favorable outcomes}}{\text{# possible outcomes}}\;. $$

This is the principle of indifference. It is often a good way to obtain probabilities in concrete situations, but it is not an axiom of probability, and probability distributions can take many other forms. A probability distribution that satisfies the principle of indifference is a uniform distribution; any outcome is equally likely. You are right that there is no uniform distribution over a countably infinite set. There are, however, non-uniform distributions over countably infinite sets, for instance the distribution $p(n)=6/(n\pi)^2$ over $\mathbb N$.
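As a quick sanity check (a sketch in Python, not part of the argument), the partial sums of $p(n)=6/(n\pi)^2$ do approach $1$, since $\sum_{n\ge1} 1/n^2 = \pi^2/6$:

```python
import math

def p(n):
    """Probability of n under p(n) = 6 / (n * pi)^2, for n = 1, 2, 3, ..."""
    return 6.0 / (n * math.pi) ** 2

# Partial sums converge to 1 because sum(1/n^2) over all n is pi^2 / 6.
partial = sum(p(n) for n in range(1, 1_000_000))
print(partial)  # close to 1
```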

For uncountable sets, on the other hand, there cannot be any distribution, uniform or not, that assigns non-zero probability to uncountably many elements. This can be shown as follows:

Consider, for each $n\in\mathbb N$, the class of elements whose probability lies in $(1/(n+1),1/n]$. The union of all these intervals is $(0,1]$, so every element with non-zero probability falls into exactly one class. If each class were finite, we could enumerate all elements with non-zero probability by first listing those for $n=1$, then those for $n=2$, and so on, so there would be only countably many of them. Since we assumed uncountably many elements have non-zero probability, at least one class must be infinite (in fact uncountably infinite). But every element of the class for $n$ has probability greater than $1/(n+1)$, so by countable additivity the probabilities of infinitely many of them would sum to more than $1$, which is impossible. Thus there cannot be such a probability distribution.
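To see the countable-additivity step concretely: even finitely many elements of one class can overshoot. If the class for a given $n$ contained just $n+1$ elements, each with probability above $1/(n+1)$, their total would already exceed $1$. A small numeric illustration:

```python
n = 4
# Suppose n + 1 = 5 outcomes each had probability just above 1/(n+1) = 0.2,
# i.e. lying in the class interval (1/(n+1), 1/n] = (0.2, 0.25].
probs = [0.21] * (n + 1)

# Their sum already exceeds the total probability mass of 1,
# so an *infinite* class is all the more impossible.
print(sum(probs))  # 1.05 > 1
```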


I'll elaborate on my comment. I claim that the statement "the probability that a continuous random variable takes on a specific value is actually equal to zero" is false. I'll stick with the definition that a continuous random variable takes values in an uncountable set, or, to be more precise, that no countable subset has full measure. It is the one used by Davitenio, and in the intro of this Wikipedia article.

Take your favorite real-valued continuous random variable; call it $X$. Flip a well-balanced coin. Define a random variable $Y$ by:

  • If the coin shows heads, then $Y=X$;
  • If the coin shows tails, then $Y=0$.

The random variable $Y$ has the same range as $X$: any value taken by $X$ can be achieved by $Y$ provided that the coin shows heads. Hence, it is continuous. However, with probability at least $1/2$, we have $Y=0$, so that one specific value has non-zero probability.
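The construction is easy to simulate. Here is a sketch in Python, taking a standard normal as a stand-in for "your favorite" $X$; the empirical frequency of $Y=0$ hovers around $1/2$:

```python
import random

random.seed(0)

def sample_Y():
    """Flip a fair coin: Y = X on heads, Y = 0 on tails."""
    if random.random() < 0.5:          # heads
        return random.gauss(0.0, 1.0)  # stand-in for "your favorite" X
    return 0.0                         # tails

samples = [sample_Y() for _ in range(100_000)]
freq_zero = sum(s == 0.0 for s in samples) / len(samples)
print(freq_zero)  # close to 0.5: the atom at 0 carries probability 1/2
```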

The right notion here is that of a non-atomic measure. An atom is a point with positive measure, so a random variable that does not take any specific value with positive probability is exactly a random variable whose image measure is non-atomic. This is a tautology.

=====

Another definition of "continuous random variable" is a real-valued (or finite-dimensional-vector-space-valued) random variable whose image measure has a density with respect to the Lebesgue measure. Yes, even Wikipedia gives different definitions to the same object.

If $X$ is a continuous random variable with this definition, then there is a function $f$, non-negative and with integral equal to $1$, such that for any Borel set $I$ we have $\mathbb{P} (X \in I) = \int_I f(x) \, dx$. Since a singleton has zero Lebesgue measure, we get $\mathbb{P} (X = x) = 0$ for all $x$.
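Numerically, this shows up as interval probabilities shrinking to zero as the interval closes in on a point. A sketch for a standard normal (whose CDF can be written with the error function; the choice of distribution is just for illustration):

```python
import math

def normal_cdf(t):
    """CDF of the standard normal, an example of a law with a Lebesgue density."""
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))

x = 1.0
for eps in (1.0, 0.1, 0.01, 0.001):
    # P(x - eps < X <= x + eps) = F(x + eps) - F(x - eps)
    print(eps, normal_cdf(x + eps) - normal_cdf(x - eps))
# The interval probabilities shrink toward 0, so P(X = x) = 0.
```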

=====

My take on the subject (warning: rant): I really, really don't like the use of "continuous random variable", and more generally the use of "continuous" in opposition to "discrete". These are the kinds of terms that are over-defined, so you can't always tell which definition the user has in mind. Even if it is quite cumbersome, I prefer "measure absolutely continuous with respect to the Lebesgue measure", or with some abuse, "absolutely continuous measure", or "measure with a density". With even more abuse, "absolutely continuous random variable". It is neither pretty nor rigorous, but at least you know what you are talking about.

=====

PS: As for why your proof does not work, Joriki's answer is perfect. I would just add that the formula

$$\mathbb{P} (X = x) = \frac{\# \{ \text{favorable outcomes} \} }{\# \{\text{possible outcomes}\}}$$

only works for finite probability spaces, and only when all the outcomes have the same probability. This is what happens when you have well-balanced coins, unloaded dice, well-shuffled card decks, etc. Then you can reduce a probability problem to a combinatorial problem. This does not hold in full generality.
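A minimal sketch of the setting where the counting formula does apply: a fair six-sided die, where the space is finite and every outcome is equally likely.

```python
from fractions import Fraction

# Fair six-sided die: finite space, all outcomes equally likely,
# so the counting formula P = #favorable / #possible applies.
outcomes = list(range(1, 7))
favorable = [k for k in outcomes if k % 2 == 0]  # event: "roll is even"

p_even = Fraction(len(favorable), len(outcomes))
print(p_even)  # 1/2
```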


As I mentioned in the comments, a continuous random variable is one whose cumulative distribution function is continuous. This implies that the range is uncountable, but an uncountable range does not imply that the random variable is continuous. I am using the definition given in Statistical Inference by Casella and Berger, which is not a PhD-level text but perhaps a Master's-level one, i.e., no measure theory is involved.

Therefore, the counterexample given by @D.Thomine is a good counterexample to your thoughts. You can have a random variable with an uncountable range that assigns non-zero probability to some values. But it is not a continuous random variable, because the CDF would have a jump at such points and therefore would not be continuous.

Casella and Berger show that, for a continuous random variable,

$$0 \leq P(X = x) \leq P(x - \epsilon < X \leq x) = F(x) - F(x - \epsilon)$$

for all $\epsilon > 0$. Taking the limit of both sides, as $\epsilon$ decreases to 0, gives

$$0 \leq P(X = x) = \lim_{\epsilon \downarrow 0} [F(x) - F(x - \epsilon)] = 0$$ by the continuity of $F(x)$.
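The squeeze above is easy to watch numerically. A sketch using the Exponential($1$) CDF $F(t) = 1 - e^{-t}$ as an example of a continuous $F$ (any continuous CDF would do):

```python
import math

def F(t):
    """CDF of an Exponential(1) random variable: continuous everywhere."""
    return 1.0 - math.exp(-t) if t >= 0 else 0.0

x = 2.0
for eps in (0.1, 0.01, 0.001, 1e-6):
    # The upper bound F(x) - F(x - eps) squeezes P(X = x) down to 0.
    print(eps, F(x) - F(x - eps))
```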


This link contains a good self-contained and simple explanation. Most answers seem to introduce sub-topics which are not particularly helpful for someone looking for a preliminary idea.