Why do $\sigma$-algebras represent information, and what information does $\sigma(X)$ represent?
Solution 1:
Maybe it will be helpful to look at the case where $X$ has a small finite range, e.g., $X:\Omega\to \{1,2,3,4\}$. In this case, we can partition $\Omega$ into four disjoint subsets, $A:=X^{-1}(1)$, $B:=X^{-1}(2)$, $C:=X^{-1}(3)$, and $D:=X^{-1}(4)$. Given the value of $X$ at some $\omega$, we know which of $A$, $B$, $C$, or $D$ contains $\omega$, so we can tell whether or not each of the events $A$, $B$, $C$, and $D$ happened, and $\sigma(X)$ must contain at least these four events. But we can also tell whether $B\cup C$ happened, since this event occurs exactly when $X(\omega)\in\{2,3\}$. So $B\cup C\in \sigma(X)$ as well, and the same reasoning gives several more sets; explicitly, $\sigma(X)$ is the Boolean algebra generated by $A$, $B$, $C$, and $D$: $$\sigma(X)=\{\emptyset,A,B,C,D,A\cup B,A\cup C,A\cup D,B\cup C,B\cup D,C\cup D,A\cup B\cup C,A\cup B\cup D,A\cup C\cup D,B\cup C\cup D, \Omega\}.$$
In the general case, any element of $\sigma(X)$ is of the form $X^{-1}(S)=\bigcup_{s\in S} X^{-1}(s)$, where $S$ is a measurable subset of the range of $X$.
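For concreteness, here is a minimal Python sketch of the finite-range picture (the particular $\Omega$ and $X$ below are invented for illustration): it builds the four preimage blocks and enumerates $\sigma(X)$ as all unions of them, recovering the $2^4=16$ events listed above.

```python
# A concrete finite version of the example above (Omega and X are invented
# for this sketch): enumerate sigma(X) as all unions of the preimage blocks.
from itertools import combinations

Omega = {"a", "b", "c", "d", "e", "f"}
X = {"a": 1, "b": 1, "c": 2, "d": 3, "e": 4, "f": 4}   # X: Omega -> {1, 2, 3, 4}

# The partition blocks A = X^{-1}(1), B = X^{-1}(2), C = X^{-1}(3), D = X^{-1}(4).
blocks = {s: frozenset(w for w in Omega if X[w] == s) for s in set(X.values())}

# sigma(X): one event per subset S of the range of X, namely the union of the
# blocks X^{-1}(s) with s in S.
values = list(blocks)
sigma_X = set()
for r in range(len(values) + 1):
    for S in combinations(values, r):
        sigma_X.add(frozenset().union(*(blocks[s] for s in S)))

print(len(sigma_X))  # 16 = 2**4 events, matching the list above
```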
Solution 2:
A sense in which the $\sigma$-algebra generated by a random variable represents information is given by the Doob-Dynkin Lemma:
Result: Let $(\Omega,\mathcal{F})$ be a measurable space and $f:\Omega\to\mathbb{R}$ be a random variable. Let $g:\Omega\to\mathbb{R}$ be a function. Then $g$ is $\sigma(f)$-measurable if and only if there is a Borel-measurable function $h:\mathbb{R}\to\mathbb{R}$ such that $g=h\circ f.$
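In the finite setting of Solution 1 the lemma can be seen directly: a $\sigma(f)$-measurable $g$ must be constant on each level set $f^{-1}(s)$, so $h$ can be read off level set by level set. Here is a small Python sketch of that factorization (with a made-up $\Omega$, $f$, and $g$):

```python
# A small illustration of the Doob-Dynkin factorization on a finite Omega
# (Omega, f, and g below are invented for this sketch).
Omega = [0, 1, 2, 3, 4, 5]
f = {w: w % 3 for w in Omega}         # f has range {0, 1, 2}
g = {w: 10 * (w % 3) for w in Omega}  # g depends on w only through f(w)

# Read h off the level sets of f: h(s) is the common value of g on f^{-1}(s).
# This only succeeds because g is constant on every level set, i.e. because
# g is sigma(f)-measurable.
h = {}
for w in Omega:
    s = f[w]
    if s in h and h[s] != g[w]:
        raise ValueError("g is not sigma(f)-measurable")
    h[s] = g[w]

assert all(g[w] == h[f[w]] for w in Omega)  # g = h o f
```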
Here is a sense in which a $\sigma$-algebra may fail to represent information (the example is from Billingsley): Let $\Omega=[0,1]$, let $\mathcal{F}$ be the Borel sets, and let $\mu$ be Lebesgue measure. Let $\mathcal{C}\subseteq\mathcal{F}$ be the $\sigma$-algebra consisting of the countable sets and the sets with countable complement. There is no random variable that generates $\mathcal{C}$. Since $\mathcal{C}$ contains all singletons, it should in some sense be perfectly informative. But the conditional expectation given $\mathcal{C}$ is almost surely equal to a constant function, and so is every $\mathcal{C}$-measurable function.

To see this, note that $\mathcal{C}$ contains only events with Lebesgue measure $0$ or $1$: countable sets have measure $0$, and their complements have measure $1$. Let $g:\Omega\to\mathbb{R}$ be $\mathcal{C}$-measurable; without loss of generality, assume $g(\Omega)\subseteq [0,1]$. One of the sets $g^{-1}[0,1/2]$ and $g^{-1}[1/2,1]$ must have measure $1$ (both lie in $\mathcal{C}$, so each has measure $0$ or $1$, and their union is all of $\Omega$); say it is $g^{-1}[1/2,1]$. Then one of the sets $g^{-1}[1/2,3/4]$ and $g^{-1}[3/4,1]$ must have measure $1$. Continuing this way, we get a decreasing sequence of closed intervals that all have measure $1$ under the distribution of $g$ and whose diameters go to $0$. So the intersection contains a unique point $r$, and $g^{-1}\{r\}$ has measure $1$. Hence $g$ is almost surely equal to the random variable that is constantly $r$, and $\mathcal{C}$ is completely uninformative.
Solution 3:
It seems reasonable to define "the information represented by $X$" as the following collection of events: $$ \mathcal{I} := \{A \in \mathcal{F}: \mathbf{1}_B(X(\omega)) =\mathbf{1}_A(\omega) \text{ for all $\omega\in\Omega$, for some $B \in \mathcal{B}$}\}, $$ where $\mathcal{B}$ denotes the Borel $\sigma$-algebra on the range of $X$. In other words, an event is in $\mathcal{I}$ if and only if we can identify its occurrence by observing whether or not $X$ falls into a particular measurable set. It is straightforward to show that $\sigma(X) = \mathcal{I}$ using the Doob-Dynkin Lemma.
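As a sanity check, here is a brute-force Python verification of $\sigma(X)=\mathcal{I}$ in a small finite example (the sets below are invented for illustration), where $\mathcal{B}$ is simply the power set of the range of $X$:

```python
# Brute-force check that sigma(X) = I on a small finite example
# (Omega and X are invented for this sketch; B ranges over the power set
# of the range of X, playing the role of the Borel sets).
from itertools import combinations

Omega = {"a", "b", "c", "d"}
X = {"a": 1, "b": 2, "c": 2, "d": 3}
rng = sorted(set(X.values()))

def subsets(items):
    items = list(items)
    for r in range(len(items) + 1):
        for c in combinations(items, r):
            yield frozenset(c)

# I: events A whose indicator equals 1_B(X(.)) for some B, i.e. A = X^{-1}(B).
I = {frozenset(w for w in Omega if X[w] in B) for B in subsets(rng)}

# sigma(X): all unions of the preimage blocks X^{-1}(s), as in Solution 1.
blocks = [frozenset(w for w in Omega if X[w] == s) for s in rng]
sigma_X = {frozenset().union(*(blocks[i] for i in idx))
           for idx in subsets(range(len(blocks)))}

assert I == sigma_X
```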