On clarifying the relationship between distribution functions in measure theory and probability theory

I recently found myself confusing concepts from measure theory and probability theory, so I'd like to get an idea for what I'm misunderstanding. This definition is what started it all:

A sequence $\{X_{n}\}$ of random variables converges in distribution to $X$ if $$\lim_{n \to \infty} F_{n}(x) = F(x)$$

for every number $x \in \mathbb{R}$ at which $F$ is continuous.

Concerns:

1) Recalling that random variables are really just measurable functions, am I to understand that each distinct measurable function is associated with a unique Distribution Function by which its probability content is evaluated?

I was always under the impression that we use the Lebesgue measure (and its corresponding Distribution Function) to calculate the probability of random variables we encounter in general (except in abstract spaces). Is this just flat out wrong?

2) I also know that for any increasing, right-continuous function $F: \mathbb{R} \to \mathbb{R}$, there is a unique Borel measure $\mu_{F}$ such that $\mu_{F}((a,b]) = F(b) - F(a)$ for all $a,b$. Conversely, given a Borel measure on $\mathbb{R}$ that is finite on all bounded Borel sets, we can associate it with a real-valued, right-continuous and increasing function, which is unique up to an additive constant.

Okay, so by Littlewood's principles, we know that measurable functions are nearly continuous. So this could justify associating each random variable $X_{n}$ with a unique Distribution Function $F_{n}$. But random variables (i.e., measurable functions) don't have to be increasing, so that adds to my confusion.

Short Summary:

1) To calculate the probability of a generic real-valued random variable, do we just use the CDF associated with Lebesgue measure, or does the random variable have its own CDF?

2) If we can associate a CDF to a general random variable, how is this done if the function is not increasing?


A (real valued) random variable is just a measurable map $X : \Omega \to \Bbb{R}$, where $(\Omega, \mathcal{F}, \Bbb{P})$ is an arbitrary probability space.

What we can then do is to consider the push-forward measure $\Bbb{P}_X = X_\ast \Bbb{P}$ of $\Bbb{P}$ by $X$. This is sometimes called the distribution of $X$. By definition, we have

$$ X_\ast \Bbb{P} (E) = \Bbb{P}(X^{-1}(E)) = \Bbb{P}(X \in E), $$

for any (measurable) $E \subset \Bbb{R}$, so that (check this) $\Bbb{P}_X$ is a probability measure on $\Bbb{R}$. Note that the last expression is the one that most mathematicians in probability theory would use.
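
As a minimal sketch of the push-forward construction, here is a finite example in Python. The sample space (a fair die), the map `X`, and the helper names `pushforward` and `prob_X_in` are all assumptions made up for illustration; nothing here is specific to a particular library.

```python
from collections import defaultdict
from fractions import Fraction

# Hypothetical finite probability space: a fair six-sided die.
omega = [1, 2, 3, 4, 5, 6]
P = {w: Fraction(1, 6) for w in omega}


def X(w):
    # Illustrative random variable; any real-valued function on omega works.
    return (w - 3) ** 2


def pushforward(P, X):
    """Return the law of X as a dict mapping each value x to P(X = x)."""
    law = defaultdict(Fraction)  # Fraction() is 0, so missing keys start at 0
    for w, p in P.items():
        law[X(w)] += p
    return dict(law)


P_X = pushforward(P, X)


def prob_X_in(E):
    """P_X(E) = P(X^{-1}(E)) for a finite set E of real numbers."""
    return sum((p for x, p in P_X.items() if x in E), Fraction(0))
```

Here `P_X` lives on the values of `X`, not on `omega`, and its masses still sum to one, which is the finite version of the "check this" above.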

Now - as you already stated yourself - we can associate to every finite measure $\mu$ on $\Bbb{R}$ (in particular, to every probability measure) the distribution function $F = F_\mu$ of $\mu$, given by

$$ F_\mu (x) = \mu((-\infty, x]). $$

In this way, we can also associate to the measure $\Bbb{P}_X$ the distribution function $F_X = F_{\Bbb{P}_X}$ which satisfies

$$ F_X (a) = \Bbb{P}_X ((-\infty, a]) = \Bbb{P}(X \in (-\infty, a]) = \Bbb{P}(X \leq a). $$

Sometimes, this is also called the distribution of $X$. Note that we now call both the measure $\Bbb{P}_X$ and its distribution function $F_X = F_{\Bbb{P}_X}$ the "distribution of $X$". But as each of these two objects uniquely determines the other, this is not much of a problem.

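Finally, none of this has much to do with the properties of $X$ as a function (i.e. with properties like continuity or monotonicity of $X$, ...). To see this, note that $\Omega$ is an arbitrary set carrying a probability measure, with no topology or order assumed. Hence, it does not make sense in general to talk about continuity of $X$, for example. It is $F_X$ that is increasing, not $X$: if $a \leq b$, then $\{X \leq a\} \subset \{X \leq b\}$, so $F_X(a) \leq F_X(b)$.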
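
To make the point about monotonicity concrete, here is a small Python sketch on a finite sample space. The die, the map `X`, and the names `F_X`, `grid` and `values` are hypothetical choices for illustration. The random variable is deliberately not monotone, yet its distribution function is nondecreasing.

```python
from fractions import Fraction

# Hypothetical finite probability space: a fair six-sided die.
omega = [1, 2, 3, 4, 5, 6]
P = {w: Fraction(1, 6) for w in omega}


def X(w):
    # Deliberately not monotone in w: it decreases, then increases.
    return (w - 3) ** 2


def F_X(a):
    """F_X(a) = P(X <= a), computed directly on the sample space."""
    return sum((p for w, p in P.items() if X(w) <= a), Fraction(0))


# Evaluate the distribution function at a few points.
grid = [-1, 0, 1, 4, 9]
values = [F_X(a) for a in grid]
```

The list `values` is nondecreasing even though `X` is not, because enlarging `a` can only enlarge the event $\{X \leq a\}$.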

There is a different notion of a continuous random variable. Here, we call $X$ a continuous random variable if the distribution function $F_X$ is continuous. This is equivalent to the condition $\Bbb{P}(X = a) = 0$ for all $a$ (why?) and thus has nothing to do with continuity of $X$ as a function (as above, this concept does not even make sense in general).
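
One possible way to see this equivalence, as a sketch: the events $\{a - 1/n < X \leq a\}$ decrease to $\{X = a\}$, so continuity from above of the finite measure $\Bbb{P}$ gives

$$ \Bbb{P}(X = a) = \lim_{n \to \infty} \Bbb{P}\left(a - \tfrac{1}{n} < X \leq a\right) = \lim_{n \to \infty} \left( F_X(a) - F_X\left(a - \tfrac{1}{n}\right) \right) = F_X(a) - \lim_{x \uparrow a} F_X(x). $$

Since $F_X$ is always right-continuous, it is continuous at $a$ exactly when the left limit equals $F_X(a)$, i.e. exactly when $\Bbb{P}(X = a) = 0$.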

Short summary:

1) Each real-valued random variable comes with its own cumulative distribution function. If we place additional assumptions on $X$, then it might be the case that this distribution function is the one associated to Lebesgue measure. Note that we have to restrict Lebesgue measure to (e.g.) an interval of length $1$ to do this, because otherwise it is not a probability measure.

2) As explained above, the associated CDF is given by

$$ F_X (a) = \Bbb{P}(X \leq a). $$
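
As a hedged illustration of point 1), assume $\Omega = [0,1]$ with its Borel sets and let $\Bbb{P} = \lambda$ be Lebesgue measure restricted to $[0,1]$. The two random variables $X(\omega) = \omega$ and $Y(\omega) = \omega^2$ are chosen here only as examples. For $a \in [0,1]$ we get

$$ F_X(a) = \lambda(\{\omega \in [0,1] : \omega \leq a\}) = a, \qquad F_Y(a) = \lambda(\{\omega \in [0,1] : \omega^2 \leq a\}) = \lambda([0, \sqrt{a}]) = \sqrt{a}, $$

and both functions equal $0$ for $a < 0$ and $1$ for $a > 1$. So even on the same underlying space with Lebesgue measure, $X$ and $Y$ have different CDFs; only $X$ has the CDF of Lebesgue measure on $[0,1]$ itself.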