So what really is a random variable?

Solution 1:

I'm going to answer this question from a slightly different perspective. Or, rather, from two different perspectives, since they're both relevant:

From one historical and practical perspective, what random variables are supposed to represent is simply unknown variables whose values are uncertain in some way that we can quantify, the prototypical example being the result of a die roll that hasn't happened yet.

We can then algebraically manipulate these unknown variables to obtain expressions for other variables whose values may also be uncertain (like, say, the winner of the game being played with those dice that haven't been rolled yet). And we can ask how uncertain we are about the values of these dependent random variables, and how likely they are to take a particular value or values, given what we know / believe / assume about the likelihoods of the original unknown variables (such as the die rolls) taking specific values.

The other perspective arises from trying to rigorously formalize the intuitive and pragmatic concept described above.

For this, we need not only rigorous rules for manipulating algebraic expressions involving unknown variables (which we historically had long before rigorous probability theory became a thing), but also a rigorous way to specify how likely these "quantifiably unknown" variables are to take particular values. We then need a way to take these quantified distributions of the random variables over their possible values and use them to calculate the corresponding distributions for new variables obtained by logically and algebraically manipulating the original ones.

Historically, this formalization has evolved over time. The earliest formalizations of probability theory simply assigned probabilities to discrete independent events, which works fine as long as we're dealing with things like a series of independent dice rolls that can each only take one of a finite set of discrete possible values. But for formalizing things like the random location of a dart thrown at a dartboard, we need to allow our random variables to range over a continuous range of possible values, which introduces apparent paradoxes (like the probability of the dart hitting any given point being zero, yet it still always hitting some point) that our formalization needs to handle. And for dealing with "random variables" like the trajectory of a diffusing microscopic particle over time or the temperature at every point on the Earth's surface two days from now, we need an even more advanced formalization. And, at some point along the way, we also need to figure out how to really rigorously deal with dependencies between random variables, which historically wasn't a trivial thing at all.

The current formalization that we've settled on is the one that's already been described in other answers: we define random variables as measurable functions from a probability space to a measurable space and then define rules for algebraically manipulating these functions as if they were just elements of their codomain, plus some useful extra rules for things like conditioning a random variable on an event or taking the (conditional) expected value of a random variable and so on.
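To make this less abstract, here is a minimal Python sketch (all names are mine, purely illustrative) that treats a finite probability space exactly as the formalism says: the space is a set of outcomes with a measure, a random variable is just a function on it, and the variable's distribution is obtained by pushing the measure forward through that function:

```python
from fractions import Fraction
from itertools import product

# A toy probability space: Omega is all 36 ordered pairs of die rolls,
# each assigned probability 1/36 (the sigma-algebra is the full power set).
omega = list(product(range(1, 7), repeat=2))
prob = {w: Fraction(1, 36) for w in omega}

# Random variables are just functions Omega -> R.
def die1(w):
    return w[0]

def die2(w):
    return w[1]

# "Algebraically manipulating" them yields new random variables,
# e.g. the total of the two rolls.
def total(w):
    return die1(w) + die2(w)

def distribution(X):
    """Push the measure on Omega forward through X."""
    dist = {}
    for w, p in prob.items():
        dist[X(w)] = dist.get(X(w), 0) + p
    return dist

print(distribution(total)[7])  # 1/6
```

Note that nothing about `total` itself is "random"; all the randomness lives in the measure on `omega`, which is exactly the point of the formalization.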

But the important thing to realize is that all this formalism involving sigma-algebras and measures and functions exists just to create a rigorous foundation for the intuitive concept of a "variable with an uncertain value", while avoiding all the various paradoxes that may arise in various edge cases if one attempts to do so in a more naïve manner.

In particular, after having learned (and, hopefully, on some level understood) these definitions in an undergraduate probability theory class, most mathematicians or statisticians will never again deal directly with low-level stuff like sample spaces and sigma-algebras. In practice, they are nearly always simply assumed to exist and to be sufficiently fine-grained to allow defining all the actual random variables and their possible interdependencies that one happens to need for a particular calculation.


P.S. Anyway, to answer your literal question: yes, both "the number of heads obtained on a coin toss" and "the number of tails obtained on a coin toss" are valid random variables, and they correspond to your two $X$ functions. For a fair coin, as in your example, both of these random variables have an expected value of $\frac12$. (Not "a probability of $\frac12$": events have probabilities, random variables have values.)

Note that, as defined, your two $X$'s are dependent random variables, since they describe the same coin toss (which is the only thing your probability space contains). In particular, as defined, their sum is exactly $1$ for every outcome. For two independent fair coin tosses, the sum would instead be a binomially distributed random variable $S$ with $P(S = 0) = P(S = 2) = \frac14$ and $P(S = 1) = \frac12$. But to define that, you'd need a bigger probability space.
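As a sanity check, here's a short Python sketch (hypothetical names, not from the question) that builds the bigger four-outcome probability space for two independent fair tosses and recovers exactly this binomial distribution for the sum:

```python
from fractions import Fraction
from itertools import product

# Probability space for two independent fair coin tosses:
# Omega = {HH, HT, TH, TT}, each outcome having probability 1/4.
omega = list(product(["H", "T"], repeat=2))
p = Fraction(1, 4)

# S = total number of heads, defined as a function on Omega.
def S(w):
    return sum(1 for toss in w if toss == "H")

# Push the measure forward to get the distribution of S.
dist = {}
for w in omega:
    dist[S(w)] = dist.get(S(w), 0) + p

# P(S = 0) = P(S = 2) = 1/4 and P(S = 1) = 1/2, as claimed above.
```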

Solution 2:

A real-valued random variable is just a measurable function from $\Omega$ to $\mathbb{R}$.

For practical purposes you can forget the word measurable and think of it as just a function.

It's as simple as that. "Random variable" is just a fancy, somewhat confusing name,
which I guess has stuck mainly for historical reasons.

One could argue that a random variable (r.v.) is neither a variable nor random.
It's just a function from $\Omega$ to the reals.

Of course, if it's not a real-valued r.v. but instead takes values in some other set $S$, then, well... you just replace $\mathbb{R}$ with $S$ in that definition.

Solution 3:

The definition is correct, but your "with a probability of $1/2$ each" is nonsense. Probabilities are assigned to events, which are measurable subsets of the sample space, not to random variables.

Your first $X$ ($1$ for heads, $0$ for tails) is a random variable. Your second is another random variable. A third would be $0$ for both heads and tails, and a fourth would be $1$ for both heads and tails.

But we generally don't restrict the values of the random variables to $0$ and $1$. If we did, they'd just be the indicator functions of events. Usually we allow real values.
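The correspondence between $\{0, 1\}$-valued random variables and indicator functions of events is easy to see in code; this Python sketch (names are my own) runs both directions of it:

```python
# Sample space for a single coin toss.
omega = {"heads", "tails"}

# The question's first X: 1 for heads, 0 for tails.
def X(w):
    return 1 if w == "heads" else 0

# Any {0,1}-valued random variable determines an event, namely the
# preimage X^{-1}({1}), a subset of the sample space:
event = {w for w in omega if X(w) == 1}
assert event == {"heads"}

# Conversely, any event A determines an indicator random variable:
def indicator(A):
    return lambda w: 1 if w in A else 0

# The round trip gives back the random variable we started with.
assert all(indicator(event)(w) == X(w) for w in omega)
```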

EDIT: Perhaps I should mention that working probabilists usually don't think this way: the focus is on the random variables and their distributions, while the probability space is hardly mentioned. See my answer here (and read the other answers too while you're at it).

Solution 4:

I would strongly urge you not to consider the set $\{ \mathrm{heads}, \mathrm{tails} \}$ to be your $\Omega.$ You just can't do much with an $\Omega$ that has only two elements in it.

When I flip a coin I expect the result to be either "heads" or "tails", which would be signified by the value of a random variable. Either I take the value space $S = \{ \mathrm{heads}, \mathrm{tails} \}$, so that the value of the random variable is literally heads or tails, or I take $S = \{ 0, 1 \}$ and say that $X(1) = 1$ means that flip number $1$ is heads, while $X(1) = 0$ means that flip number $1$ is tails.

If I want to model a sequence of three tosses of a fair coin, then I want something like the variables $X(1),$ $X(2),$ $X(3),$ where each $X(t)$ has an equal chance of being heads or tails and each $X(t)$ is independent of the other two. In particular, there are eight different possible sequences of outcomes:

$$ HHH, HHT, HTH, HTT, THH, THT, TTH, TTT. $$

In order for this to be possible, I need $\Omega$ to have at least eight elements, because each element of $\Omega$ determines the values of all three variables $X(1),$ $X(2),$ and $X(3).$ No one element of $\Omega$ can produce more than one of the sequences of outcomes listed above.
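Here is a small Python sketch (hypothetical names) of exactly this setup: an eight-element $\Omega$, coordinate random variables $X(1),$ $X(2),$ $X(3),$ and a check that the coordinates really are fair and independent:

```python
from fractions import Fraction
from itertools import product

# Omega: all eight sequences of three tosses, each with probability 1/8.
omega = list(product("HT", repeat=3))
prob = {w: Fraction(1, 8) for w in omega}
assert len(omega) == 8

# X(t) is the t-th coordinate projection: the result of flip number t.
# Each element of Omega determines all three values at once.
def X(t):
    return lambda w: w[t - 1]

# Probability of an event described by a predicate on outcomes.
def P(pred):
    return sum(p for w, p in prob.items() if pred(w))

# Each X(t) is fair...
assert all(P(lambda w: X(t)(w) == "H") == Fraction(1, 2) for t in (1, 2, 3))

# ...and the tosses are independent: joint probabilities factor.
assert P(lambda w: w == ("H", "H", "H")) == Fraction(1, 2) ** 3
for s, t in [(1, 2), (1, 3), (2, 3)]:
    joint = P(lambda w: X(s)(w) == "H" and X(t)(w) == "H")
    assert joint == Fraction(1, 4)
```

A two-element $\Omega$ simply couldn't support this: with only two outcomes there is no way to assign each of the eight sequences its own point of the sample space.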