Notation of random variables

I am really confused about the capitalization of variable names in statistics. When should a random variable be represented by an uppercase letter, and when by a lowercase one?

For a probability $P(X \leq x)$, what do $x$ and $X$ mean here?


Solution 1:

You need to dissociate $x$ from $X$ in your mind. Sometimes it matters that they are "the same letter", but in general it does not. They are two different characters that mean two different things, and the fact that they share a name when read out loud carries no mathematical meaning.

By convention, a lot of the time we give random variables names which are capital letters from around the end of the alphabet. That doesn't have to be the case—it's arbitrary—but it's a convention. So just as an example here, let's let $X$ be the random variable which represents the outcome of a single roll of a die, so that $X$ takes on values in $\{1,2,3,4,5,6\}$. Now I believe you understand what would be meant by something like $P(X\leq 2)$: it's the probability that the die comes up either a 1 or a 2. Similarly, we could evaluate numbers for $P(X\leq 4)$, $P(X\leq 6)$, $P(X\leq \pi)$, $P(X\leq 10000)$, $P(X\leq -230)$ or $P(X\leq \text{any real number that you can think of})$. Another way to say this is that $P(X\leq\text{[blank]})$ is a function of a real variable: we can put any number into [blank] that we want and we end up with a unique number. Now a very common symbol for denoting a real variable is $x$, so we can write this function as $P(X\leq x)$. In this expression, $X$ is fixed, and $x$ is allowed to vary over all real numbers.
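
To make the "function of a real variable" idea concrete, here is a minimal Python sketch. It assumes the die is fair (the text above does not say so explicitly), and the names `FACES` and `cdf` are made up for this illustration.

```python
from fractions import Fraction
import math

# Possible values of the random variable X: one roll of a die.
FACES = range(1, 7)

def cdf(x):
    """Return P(X <= x) for any real number x, assuming a fair die."""
    # Count the faces that satisfy the event {X <= x}.
    favorable = sum(1 for face in FACES if face <= x)
    return Fraction(favorable, len(FACES))

# X stays fixed; only the real number x changes from call to call.
for x in (2, 4, 6, math.pi, 10000, -230):
    print(f"P(X <= {x}) = {cdf(x)}")
```

Note that `x` does not have to be a value the die can actually show: `cdf(math.pi)` counts the faces 1, 2 and 3, and `cdf(-230)` counts none.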

It's not especially significant that $x$ and $X$ are the same letter here. We could just as well write $P(X\leq y)$, and this would be the same function. Where it really starts to come in handy that $x$ and $X$ are the same letter is when you are dealing with things like joint distributions, where you have more than one random variable and are interested in a probability such as $P(X\leq \text{[some number] and } Y\leq\text{[some other number]})$, which can be written more succinctly as $P(X\leq x,Y\leq y)$. Since it's hard to keep track of a lot of symbols at the same time, it's convenient that $x$ corresponds to $X$ in an obvious way.

By the way, for a random variable $X$, the function $P(X\leq x)$ is a very important function. It is called the cumulative distribution function and is usually denoted by $F_X$, so that $$F_X(x)=P(X\leq x)$$
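
As a worked instance, take the die example from earlier and assume the die is fair (an assumption, since the example does not specify it). Then the cumulative distribution function can be written out explicitly: $$F_X(x)=\begin{cases}0 & \text{if } x<1,\\ \lfloor x\rfloor/6 & \text{if } 1\leq x<6,\\ 1 & \text{if } x\geq 6,\end{cases}$$ so that, for instance, $F_X(2)=1/3$, $F_X(\pi)=1/2$, and $F_X(-230)=0$.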

Solution 2:

When one writes $P(X\leqslant x)$, one usually means that $X:\Omega\to\mathbb R$ is a random variable and $x$ a real number, so that $P(X\leqslant x)=P(A)$ where $$ A=\{X\leqslant x\}=\{\omega\in\Omega\mid X(\omega)\leqslant x\}=X^{-1}((-\infty,x]). $$
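
A small Python sketch of this set-theoretic reading. It assumes a fair die with $\Omega=\{1,\dots,6\}$, the uniform measure, and $X(\omega)=\omega$. None of these choices come from the answer itself, and the names `OMEGA`, `X`, `event` and `prob` are purely illustrative.

```python
from fractions import Fraction

# Assumed sample space: the six equally likely outcomes of one die roll.
OMEGA = frozenset(range(1, 7))

def X(omega):
    """The random variable as a function from OMEGA to the reals (here the identity)."""
    return omega

def event(x):
    """The set A = {omega in OMEGA : X(omega) <= x}, i.e. the preimage of (-inf, x]."""
    return {omega for omega in OMEGA if X(omega) <= x}

def prob(A):
    """P(A) under the uniform measure on OMEGA."""
    return Fraction(len(A), len(OMEGA))

A = event(2)
print(sorted(A), prob(A))
```

Here `X` is a function, `x` is a plain number, and `event(x)` is a subset of `OMEGA`. The probability is assigned to that subset, not to `X` or `x` directly.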

Solution 3:

$P(X\leq x)$ means the probability that the random variable $X$ is less than or equal to the realization $x$. The lower case $x$ is a fixed constant, whereas $X$ is a random variable.