Everything in the Power Set is measurable?
I'm taking a graduate class in probability. My background is in engineering (I'm very used to math in an applied sense). I am also taking an undergraduate class in real analysis alongside it (I should have taken it before, but I couldn't). I have a couple of questions:
We're spending time looking at measurable functions on measurable sets. The definition of a "measurable set" is one that lies in a sigma algebra. My conceptual understanding of a sigma algebra (I know the technical def: closure under complements and countable unions, etc.) is that it is the resolution with which we understand a certain space: the sets that can be measured or, even more simply, the sets we can actually use. We say a sigma algebra is the "domain" of our measure. In other words, a (prob) measure can't measure just any old arbitrary set or collection of sets. We define a sigma algebra to handle this, and say our measure operates over this sigma algebra. However, the power set is actually a sigma algebra (the largest one, according to our definition), and yet not every element of the power set is measurable? I'm having a little trouble reconciling my conceptual understanding of a sigma algebra (the well-behaved, measurable sets) with its actual def (which gives us the power set dilemma).
How does the Borel sigma algebra fit into this conceptual understanding?
What about non-measurable sets?
Is there a concept of the largest sigma algebra of only measurable sets, which is a subset of the power set?
Formally, measure (resp. probability) theory requires us to work with a triple $(\Omega, \mathcal{F}, P)$ where $\Omega$ is the space we are working on, $\mathcal{F}$ is a $\sigma-$algebra on $\Omega$ and $P$ is a (probability) measure which maps elements of $\mathcal{F}$ to numbers (between $0$ and $1$). We call the elements of $\mathcal{F}$ the "$\mathcal{F}$ measurable sets". For any non-trivial $\Omega$, you will have many potential $\sigma-$algebras that you can use in the place of $\mathcal{F}$. As you say, one option is to take $\mathcal{F} = 2^\Omega$ (the power set of $\Omega$) to be our $\sigma-$algebra. With this choice every subset of $\Omega$ is in $\mathcal{F}$, so everything is measurable here. Why is that an issue? Among other things, the power set is often too big for $P$ to have nice properties. In many (most, honestly) cases, the nice properties we want $P$ to have are what really drive the probability, not the particulars of $\mathcal{F}$.
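To make "measurable relative to $\mathcal{F}$" concrete, here is a small finite sketch in Python. It is not part of the original argument, and the helper names `power_set` and `generated_sigma_algebra` are made up for illustration. On a finite $\Omega$ closure under countable unions reduces to closure under finite unions, so the smallest $\sigma-$algebra containing some generators can be found by brute force.

```python
from itertools import chain, combinations

def power_set(omega):
    """All subsets of omega, as frozensets."""
    items = list(omega)
    subsets = chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1)
    )
    return {frozenset(s) for s in subsets}

def generated_sigma_algebra(omega, generators):
    """Smallest family containing the generators that is closed under
    complement and union (finite unions suffice on a finite omega)."""
    omega = frozenset(omega)
    family = {frozenset(), omega} | {frozenset(g) for g in generators}
    while True:
        new = {omega - a for a in family} | {a | b for a in family for b in family}
        if new <= family:  # nothing new was produced: the family is closed
            return family
        family |= new

omega = {1, 2, 3, 4}
coarse = generated_sigma_algebra(omega, [{1, 2}])
fine = generated_sigma_algebra(omega, [{1}, {2}, {3}])

print(sorted(map(sorted, coarse)))   # [[], [1, 2], [1, 2, 3, 4], [3, 4]]
print(frozenset({1}) in coarse)      # False: {1} is not coarse-measurable
print(fine == power_set(omega))      # True: these generators give the power set
```

The same set $\{1\}$ is measurable with respect to one $\sigma-$algebra and not the other, which is the sense in which "measurable" always refers to a particular choice of $\mathcal{F}$. On a finite space the power set causes no trouble; the difficulties described here only appear on spaces like the real line.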
Stefan gives the standard non-probabilistic example of this in the comments. The Lebesgue measure, which is the natural notion of volume on the real line, is not compatible with the power set as the $\sigma-$algebra in our triple, so we need to pick a new one. The definition of the Borel $\sigma-$algebra is that it is the smallest $\sigma-$algebra containing the open intervals (which had better be measurable if we are going to define volume). The intuitive notion of volume can be defined consistently on this $\sigma-$algebra, so it is the smallest $\sigma-$algebra we can choose with the property that $\mu((a,b)) = b-a$ for all open intervals $(a,b)$. Why not stop here? Not all subsets of Borel sets of measure $0$ are Borel measurable, and it is often nice for the sake of theory to not have to worry about those sets. The Lebesgue $\sigma-$algebra is what you get if you insist that all subsets of sets of measure zero are measurable. In this case, as in many cases, because this $\sigma-$algebra is so natural we often drop the formalism and just say that a set is "measurable" or "not measurable" on the real line, when what we really mean is that it is measurable or not measurable with respect to the Lebesgue $\sigma-$algebra. I believe that this last issue is the source of your confusion. Whatever you were reading omitted the fact that it was referring to the Lebesgue $\sigma-$algebra.
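In symbols, writing $\mathcal{B}$ for the Borel $\sigma-$algebra and $\mathcal{L}$ for the Lebesgue $\sigma-$algebra on $\mathbb{R}$ (this notation is introduced here only for illustration), the description above amounts to roughly the following:

$$\mathcal{L} = \{\, B \cup N : B \in \mathcal{B},\ N \subseteq Z \text{ for some } Z \in \mathcal{B} \text{ with } \mu(Z) = 0 \,\}, \qquad \mathcal{B} \subsetneq \mathcal{L} \subsetneq 2^{\mathbb{R}}.$$

The second strict inclusion is the one that relies on something like the axiom of choice.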
It is not too difficult and not too trivial to construct a set which is Lebesgue measurable but not Borel measurable. In general, most sets you can write down will end up being Borel. By contrast, constructing sets which are not Lebesgue measurable requires using something like the axiom of choice. Analysts are fond of saying that if you can write it down explicitly, it is Lebesgue measurable.
Let me make one quick comment about why probabilists like to use the Borel $\sigma-$algebra rather than the Lebesgue $\sigma-$algebra. For an analyst, the definition of a function being measurable is that the inverse image of open sets is measurable. Since probabilists don't require our spaces to have topologies, this really doesn't work for us. For a probabilist, the definition of a function being measurable is that the inverse image of a measurable set is measurable. The Borel $\sigma-$algebra has the nice property that if you compose two Borel measurable functions, you get another Borel measurable function in either definition. This property fails badly for Lebesgue measurable functions with the analysts' definition of measurable.
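The composition property can be checked in one line. As a sketch, assume $f,g:\mathbb{R}\to\mathbb{R}$ are Borel measurable in the probabilists' sense and $B$ is a Borel set (the names $f$, $g$, $B$ are chosen here for illustration):

$$(g\circ f)^{-1}(B) = f^{-1}\big(g^{-1}(B)\big).$$

Here $g^{-1}(B)$ is Borel because $g$ is Borel measurable, and then $f^{-1}$ of that Borel set is Borel because $f$ is. With the analysts' definition for Lebesgue measurable functions, $g^{-1}(U)$ for open $U$ is only guaranteed to be Lebesgue measurable, and the inverse image under $f$ of a Lebesgue measurable set need not be Lebesgue measurable, so the same chain breaks at the second step.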
I encourage reading Francis Edward Su's paper "The Banach-Tarski Paradox" for a good introduction to unmeasurable sets. I borrow from this paper the following broad-strokes example.
It is possible to partition the unit circle into countably infinitely many pieces such that A) each piece is congruent to every other piece by a rotation of $x\cdot\pi$ radians for some rational $x$, and B) no two points in the same piece can be rotated into each other by such a rotation. Call the collection of pieces $X$. Assuming a countably additive measure $\mu$ that is defined on every piece, we want the total measure of the pieces in $X$ (written $\mu(X)$) to be the measure of the whole circle, say 1.
But of course we can split the collection further into $X'$ and $X''$ such that both are countably infinite and $\mu(X')+\mu(X'')=1$. Since $X$ and $X''$ are both countably infinite, we can establish a bijection $f:X''\to X$, and by A) each piece $S\in X''$ can be rotated onto $f(S)$; the same holds for $X'$. If our measure is rotation invariant, then $\mu(S)=\mu(f(S))$ for every piece, and summing over the pieces with countable additivity gives $\mu(X)=\mu(X')=\mu(X'')=1$. Then $\mu(X')+\mu(X'')=2$, which is clearly impossible. What this shows is that we cannot produce a countably additive, rotation-invariant measure on the unit circle if we want to extend that measure to every subset. The Vitali sets used in this example must be excluded from the sets we try to define such a measure over.
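The same contradiction can be reached by a direct count, which is a common alternative way to present it. As a sketch, write $V$ for one of the pieces and $V_q$ for its rotation by $q\pi$ radians, with $q$ ranging over the rationals in $[0,2)$ (this indexing is an assumption made here for illustration). Rotation invariance gives every piece the same measure $c=\mu(V)$, and countable additivity then forces

$$1 = \mu\Big(\bigcup_{q} V_q\Big) = \sum_{q\in\mathbb{Q}\cap[0,2)} \mu(V_q) = \sum_{q\in\mathbb{Q}\cap[0,2)} c .$$

The right-hand side is $0$ if $c=0$ and $\infty$ if $c>0$, so it can never equal $1$.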