The $\sigma$-algebra of subsets of $X$ generated by a set $\mathcal{A}$ is the smallest $\sigma$-algebra including $\mathcal{A}$

I am struggling to understand why it should be that the $\sigma$-algebra of subsets of $X$ generated by $\mathcal{A}$ should be the smallest $\sigma$-algebra of subsets of $X$ including $\mathcal{A}$.

Let me try to elucidate my understanding of the topic, in the hope that somebody patient and kind might be able to fill in the gaps.

If $X$ is a set, and $\mathcal{G}$ is any non-empty family of $\sigma$-algebras of subsets of $X$, then I am very happy that

$$ \bigcap \mathcal{G} := \left\{ E : E \in \Sigma \ \textrm{for all } \Sigma \in \mathcal{G} \right\},$$

the intersection of all the $\sigma$-algebras belonging to $\mathcal{G}$, is a $\sigma$-algebra of subsets of $X$.

Now, if $\mathcal{A}$ is any family of subsets of $X$, and we define

$$ \mathcal{G} := \left\{ \Sigma : \Sigma \ \textrm{is a } \sigma \textrm{-algebra of subsets of } X, \mathcal{A} \subseteq \Sigma \right\},$$

then we have by definition that $\mathcal{G}$ is a family of $\sigma$-algebras of subsets of $X$; also, since $\mathcal{P} X \in \mathcal{G}$ we have that it is non-empty. So $\Sigma_{\mathcal{A}} := \bigcap \mathcal{G}$, called the $\sigma$-algebra of subsets of $X$ generated by $\mathcal{A}$, is a $\sigma$-algebra of subsets of $X$. Because $\mathcal{A} \subseteq \Sigma$ for every $\Sigma \in \mathcal{G}$, we have $\mathcal{A} \subseteq \Sigma_{\mathcal{A}}$; thus $\Sigma_{\mathcal{A}}$ itself belongs to $\mathcal{G}$.

However, I cannot get my head around why it should be that $\Sigma_{\mathcal{A}}$ should be the smallest $\sigma$-algebra of subsets of $X$ including $\mathcal{A}$, perhaps because I am not entirely sure what this statement means explicitly (namely, I have problems interpreting 'smallest' and 'including')! I'd be very relieved if someone could try to explain this to me as it has been bugging me for a week now; I have a feeling that it might rely heavily on the $\bigcap$, but I'm not sure exactly how...

Let me make a general comment rather than a specific one, because the construction that you are having trouble with is one that is very common and very useful (though it does have its limitations; see below) so it is important and good to have it "down" properly.

You have the following situation: you are considering a certain type of object of interest. For simplicity, let's look at the earliest example that most students encounter, which is vector spaces. So, you are looking at vector spaces. Specifically, you are looking at a particular vector space $\mathbf{V}$.

The objects have sub-objects (subspaces). These are subsets of your original $\mathbf{V}$, which are also objects (vector spaces) in their own right. Not every subset is a subobject, but every subobject is a subset.

In this situation, it is often fruitful to consider the following problem:

Given a subset $S$ of $\mathbf{V}$, what is the smallest subspace of $\mathbf{V}$ that contains $S$?

That is, we want to find a $\mathbf{W}$ with the following properties:

$\mathbf{W}$ is a subspace of $\mathbf{V}$;
$S$ is contained in $\mathbf{W}$ ("...that contains $S$");
If $\mathbf{Z}$ is any subspace of $\mathbf{V}$ that contains $S$, then $\mathbf{W}\subseteq\mathbf{Z}$ ("... smallest ...")

This is the situation you have at hand, and it's also a very common situation that we encounter over and over again. Some examples:

Given a group $G$ and a subset $S$, to find the smallest subgroup of $G$ that contains $S$ (the "subgroup generated by $S$");
Given a group $G$ and a subset $S$, to find the smallest normal subgroup of $G$ that contains $S$;
Given a subset $S$ of the plane $\mathbb{R}^2$, to find the smallest convex set that contains $S$ (the "convex hull of $S$");
Given a set $X$ and a collection of subsets $\mathcal{S}\subseteq \mathcal{P}(X)$, find the smallest $\sigma$-algebra on $X$ that contains $\mathcal{S}$ (the case you have);
Given a set $X$ and a relation $R$ on $X$, find the smallest transitive relation on $X$ that extends $R$ (the "transitive closure");
Given a topological space $X$ and a subset $S$, find the smallest closed subset of $X$ that contains $S$ (the "closure of $S$").

and so on and so forth.

Now, in general, such a thing may not exist; or there may be minimal objects but no minimum object. For example, if in the last example above you replace "closed" with "open", there may be no such object: if $X=\mathbb{R}$ and $S=[0,1]$, there is no "smallest open set that contains $S$".

But in many situations, there is one single observation that lets you conclude that such a "smallest subobject" must exist: namely, that the intersection of any collection of "subobjects" is again a "subobject". For the example with vector spaces: is the intersection of an arbitrary family of subspaces of $\mathbf{V}$ itself a subspace of $\mathbf{V}$? For the above examples:

Is the intersection of an arbitrary family of subgroups of $G$, itself a subgroup of $G$?
Is the intersection of an arbitrary family of normal subgroups of $G$ itself a normal subgroup of $G$?
Is the intersection of an arbitrary family of convex subsets of $\mathbb{R}^2$ itself a convex subset of $\mathbb{R}^2$?
Is the intersection of an arbitrary family of $\sigma$-algebras on $X$ itself a $\sigma$-algebra on $X$?
Is the intersection of an arbitrary family of transitive relations on $X$ itself a transitive relation on $X$?
Is the intersection of an arbitrary family of closed subsets of $X$ itself a closed subset of $X$?

When the answer is "yes", then the following construction will always show that there is such a thing as the "smallest subobject that contains $S$":

Take the family of all subobjects that contain $S$; then take the intersection of the family. That's the smallest subobject that contains $S$.

Why does this work?

Because:

(i) There is at least one subobject that contains $S$, (namely the original object itself; for $\sigma$-algebras, this would be $\mathcal{P}(X)$; for the transitive closure example, you would take the "total relation" $X\times X$).

(ii) Since the intersection of an arbitrary family of subobjects is a subobject (this is our assumption), then this intersection is a subobject.

(iii) Since each thing being intersected contains $S$, the intersection contains $S$.

This means that the intersection is indeed a subobject that contains $S$. Finally:

(iv) The intersection is always contained in each and every element of the family being intersected. So if $\mathbf{Z}$ is any subobject that contains $S$, then it is a member of the family being intersected, so the intersection is contained in $\mathbf{Z}$. This shows the intersection is indeed the "smallest subobject" with the desired properties.
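On a finite set, steps (i) through (iv) can be checked by brute force. The following is only a minimal sketch under stated assumptions: $X=\{1,2,3\}$ and $\mathcal{A}=\{\{1\}\}$ are chosen purely for illustration, the helper names `powerset`, `is_sigma_algebra`, and `generated_top_down` are hypothetical, and on a finite $X$ closure under countable unions reduces to closure under pairwise unions.

```python
from itertools import combinations

def powerset(xs):
    """All subsets of xs, each returned as a frozenset."""
    xs = list(xs)
    return [frozenset(c) for r in range(len(xs) + 1) for c in combinations(xs, r)]

def is_sigma_algebra(family, X):
    """Check the sigma-algebra axioms on a finite X (assumption: X is finite,
    so countable unions reduce to pairwise unions)."""
    if X not in family:
        return False
    if any(X - E not in family for E in family):
        return False
    return all(E | F in family for E in family for F in family)

def generated_top_down(A, X):
    """Intersect every sigma-algebra on X that includes A."""
    subsets = powerset(X)
    # The family G of all sigma-algebras on X that include A.
    G = [fam for fam in powerset(subsets)
         if A <= fam and is_sigma_algebra(fam, X)]
    # (i) G is non-empty, because the full power set belongs to it.
    sigma_A = frozenset.intersection(*G)
    return sigma_A, G

# Illustrative choice of X and A (not from the original discussion).
X = frozenset({1, 2, 3})
A = frozenset({frozenset({1})})
sigma_A, G = generated_top_down(A, X)

# (ii) the intersection is a sigma-algebra, (iii) it includes A,
# (iv) it is contained in every member of G.
checks = (
    is_sigma_algebra(sigma_A, X),
    A <= sigma_A,
    all(sigma_A <= Z for Z in G),
)
```

Here `sigma_A` comes out as $\{\emptyset, \{1\}, \{2,3\}, X\}$, and all three entries of `checks` are true. The enumeration is exponential in the size of the power set, so this is a way to see the argument at work, not a practical method.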

So:

To find the smallest subspace of $\mathbf{V}$ that contains $S$, intersect all subspaces that contain $S$.
To find the smallest subgroup of $G$ that contains $S$, intersect all subgroups that contain $S$.
To find the smallest normal subgroup of $G$ that contains $S$, intersect all normal subgroups that contain $S$.
To find the smallest convex set that contains $S$, intersect all convex subsets of $\mathbb{R}^2$ that contain $S$.
To find the smallest $\sigma$-algebra that contains $S$, intersect all $\sigma$-algebras that contain $S$.
To find the smallest transitive relation that contains $R$, intersect all transitive relations that contain $R$.
To find the smallest closed subset that contains $S$, intersect all closed subsets that contain $S$.

And this works like magic. Voilà! You have shown that this object exists. It necessarily has the properties you want.

This is a "top-down" approach. Imagine yourself looking at the "big object", and you are "paring it down" until you get "just enough" for the object you want (intersections make things smaller; you are paring down stuff that may not be needed).

The problem? Like most magic spells, it doesn't really tell you much about the end product. The fact that the end product appeared "as if by magic" means that you are likely to be as clueless about the actual nature of the "smallest object" in question as you were when you started. You now know that there is such a thing, but you don't really know what it "looks like".

That is why in almost every situation like this, you also want a "bottom-up" description of this "smallest subobject that contains $S$". You want an explicit description of what it actually looks like. For the examples above:

The smallest subspace of $\mathbf{V}$ that contains $S$ is the set of all linear combinations of vectors in $S$.
The smallest subgroup of $G$ that contains $S$ is the set of all finite products of elements of $S$ and their inverses.
The smallest normal subgroup of $G$ that contains $S$ is the set of all finite products of conjugates of elements of $S$ and their inverses.
The smallest convex subset of $\mathbb{R}^2$ that contains $S$ is the set of all convex combinations of elements of $S$.
Asaf gives an explicit description of the smallest $\sigma$-algebra on $X$ that contains $S$ in his answer, described by starting from $S$.
The smallest transitive relation on a set $X$ that contains a given relation $R$ is the set of all pairs $(a,b)$ such that there exists a finite sequence $x_0,x_1,\ldots,x_n$ of elements of $X$ such that $x_0=a$, $x_n=b$, and $(x_i,x_{i+1})\in R$ for $i=0,\ldots,n-1$.
The smallest closed subset of a topological space $X$ that contains a given set $S$ is equal to $S\cup\partial S$ or to $S\cup S'$.

In each of these cases, one would have to show that the given description actually has the desired properties. This is a "bottom-up" approach.
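For the transitive closure, the bottom-up description can be sketched as a fixed-point iteration: keep adding the pair $(a,c)$ whenever $(a,b)$ and $(b,c)$ are already present, until nothing new appears. This is a minimal sketch assuming a finite relation; the names `transitive_closure` and `is_transitive` and the sample relation `R` are hypothetical and chosen only for illustration. On a finite relation the iteration produces the same set as the chain description above.

```python
def is_transitive(T):
    """True if (a, b) and (b, d) in T always imply (a, d) in T."""
    return all((a, d) in T for (a, b) in T for (c, d) in T if b == c)

def transitive_closure(R):
    """Bottom-up closure: repeatedly add composed pairs until stable."""
    closure = set(R)
    while True:
        new_pairs = {(a, d) for (a, b) in closure for (c, d) in closure if b == c}
        if new_pairs <= closure:
            return closure
        closure |= new_pairs

# Illustrative relation on {1, 2, 3, 4}: a single chain 1 -> 2 -> 3 -> 4.
R = {(1, 2), (2, 3), (3, 4)}
T = transitive_closure(R)
# T == {(1, 2), (2, 3), (3, 4), (1, 3), (2, 4), (1, 4)}
```

Checking that `T` contains `R` and is transitive is immediate; the "smallest" property takes a separate argument, since every transitive relation containing `R` must contain each pair the loop adds.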

The "top-down" description has the benefit of simplicity, that the "universal properties" that define the object are very clearly satisfied, and that they make proving results about how the "smallest object" relates to other objects easy. However, the "top-down" description is usually very hard to actually use to prove things about the specific smallest object. The "bottom-up" construction has the benefit of (usually) being a very concrete way of getting your hands on the object itself, making it easy to prove things about the object itself, but proving the universal properties is usually difficult. Thus, for example, the top-down definition of "subspace of $\mathbf{V}$ generated by $S$" in the linear algebra setting makes it very hard to figure out things like the dimension of the subspace, or a basis, while the "bottom-up" approach makes that very easy, but then proving that the collection of all linear combinations forms a subspace is more difficult than simply taking an intersection of subspaces.

In most books or presentations, when discussing "the smallest X that contains S", you will see one of two approaches:

Define it as a big intersection, then prove a theorem that gives the "bottom-up" description; or
Give a "bottom-up" description; then prove the object described has the desired properties of being a subobject, containing S, and being the "smallest".

Whenever possible, you want both descriptions because they have complementary strengths and weaknesses.
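To see the two descriptions agree in a case small enough to enumerate, here is a hedged sketch for the vector space example. It assumes the space is $\mathrm{GF}(2)^3$ (a finite stand-in chosen so that everything can be listed; the discussion above does not fix a field), with vectors encoded as 3-bit integers and addition given by XOR. The function names are hypothetical.

```python
from functools import reduce
from itertools import combinations

# Assumption: V = GF(2)^3, vectors encoded as integers 0..7, addition is XOR.
V = range(8)

def is_subspace(W):
    """Over GF(2), a subspace is a subset containing 0 and closed under addition."""
    return 0 in W and all(u ^ v in W for u in W for v in W)

def span_top_down(S):
    """Intersect all subspaces of V that contain S."""
    candidates = (frozenset(c) for r in range(len(V) + 1)
                  for c in combinations(V, r))
    containing = [W for W in candidates if is_subspace(W) and S <= W]
    return frozenset.intersection(*containing)

def span_bottom_up(S):
    """All linear combinations of S; over GF(2) these are XORs of subsets of S."""
    S = list(S)
    return frozenset(reduce(lambda x, y: x ^ y, c, 0)
                     for r in range(len(S) + 1) for c in combinations(S, r))

# Illustrative generating set: the vectors 011 and 101.
S = frozenset({0b011, 0b101})
top = span_top_down(S)
bottom = span_bottom_up(S)
# Both equal {0b000, 0b011, 0b101, 0b110}.
```

The top-down function never tells you what the span looks like until it has searched every subset, while the bottom-up function writes the elements down directly; that contrast is the point of the comparison.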

Since Qiaochu gave a good answer, which I think should be satisfactory, I will describe another way of constructing the $\sigma$-algebra generated from $\mathcal A$. As Arturo writes in the comments, this is similar to the difference between defining a subgroup generated by $X$ as the intersection of all subgroups containing $X$, and closing $X$ under the needed operations.

There is an alternative (although slightly more complicated if you want to go into the details) way to build $\Sigma_\mathcal A$.

$\Sigma^0_0=\Pi^0_0=\text{ finite intersections from }\mathcal{A}$
For $\alpha>0$ a countable ordinal, denote: $$ \Sigma^0_\alpha=\{\bigcup_{i\in\mathbb N} A_i\mid A_i\in\bigcup_{\beta<\alpha}\Pi^0_\beta\},\quad \Pi^0_\alpha = \{X\setminus A\mid A\in\Sigma^0_\alpha\},\quad \Delta^0_\alpha=\Sigma^0_\alpha\cap\Pi^0_\alpha $$

It is simple to prove that $\Sigma^0_\alpha$ is closed under countable unions and finite intersections (for $\alpha>0$) and $\Pi^0_\alpha$ is closed under countable intersections and finite unions; and $\Delta^0_\alpha$ is closed under complements as well as finite intersections and finite unions.

Slightly less simple is to prove that $\Sigma^0_\alpha\subseteq\Sigma^0_\beta$ for $\alpha<\beta$ (deduce from this fact that $\Sigma^0_\alpha$ and $\Pi^0_\alpha$ are subsets of $\Delta^0_{\alpha+1}$)

Now let $\Delta=\bigcup_{\alpha<\omega_1} \Delta^0_\alpha$, where $\omega_1$ is the first uncountable ordinal (i.e. $\aleph_1$). Assuming the axiom of choice, countable unions of countable ordinals are countable; therefore $\Delta$ is closed under countable unions and complements.

Suppose $A_i\in\Delta$ for $i\in\mathbb N$. Each $A_i$ appears at some countable stage, and a countable supremum of countable ordinals is countable, so there is some $\alpha<\omega_1$ such that $A_i\in\Sigma^0_\alpha$ for every $i$. Therefore $\bigcup A_i\in\Sigma^0_\alpha$, and since $\Sigma^0_\alpha\subseteq\Delta^0_{\alpha+1}$ the countable union is in $\Delta$; similarly for countable intersections.

I claim that $\Delta=\Sigma_\mathcal A$. While it is obvious that $\mathcal A\subseteq\Delta$, it is not at all obvious why this is the smallest $\sigma$-algebra. I will leave this proof out, as I am sure I have written enough as it is. (I'll do some looking for a good reference for a proof, and post it here.)

In reality what we did was to close up $\mathcal A$ under countable intersections and unions, as well as complements. Instead of building it from the top down, we have built it from the bottom up.
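When $X$ is finite, this closing-up process can be run directly, since it stabilizes after finitely many rounds and countable unions reduce to pairwise unions. The sketch below is only an illustration under that finiteness assumption; it does not model the transfinite stages needed in general. The name `generated_bottom_up` and the sample $X$ and $\mathcal A$ are hypothetical. Intersections are not added explicitly because they follow from complements and unions by De Morgan's laws.

```python
def generated_bottom_up(A, X):
    """Close A (plus X itself) under complements and pairwise unions
    until no new sets appear. Assumes X is finite."""
    family = set(A) | {X}
    while True:
        bigger = set(family)
        bigger |= {X - E for E in family}                   # complements
        bigger |= {E | F for E in family for F in family}   # pairwise unions
        if bigger == family:
            return frozenset(family)
        family = bigger

# Illustrative choice: X = {1, 2, 3, 4}, A = {{1}, {1, 2}}.
X = frozenset({1, 2, 3, 4})
A = {frozenset({1}), frozenset({1, 2})}
sigma_A = generated_bottom_up(A, X)
# sigma_A has 8 members: all unions of the atoms {1}, {2}, {3, 4}.
```

Note that $\{2\}$ is not in $\mathcal A$ and only shows up after a few rounds (as the complement of $\{1,3,4\}$), which gives a small taste of why the general construction needs many stages.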

This construction is usually done over the open sets of a topological space $X$, resulting in the Borel sets of $X$, and it is an important construction in descriptive set theory.

The superscript of $0$ in this notation is to point out that this is the Borel hierarchy. There is another, somewhat similar, hierarchy called the projective hierarchy, denoted by $\Sigma^1_n$ (as well as $\Pi$ and $\Delta$), which to some extent expands the Borel hierarchy.

How much does it expand? In a Polish space the Borel sets are exactly $\Delta^1_1$, so the expansion starts pretty much from the first stage of the construction.

"Smallest" means that it is contained in every $\sigma$-algebra containing $\mathcal{A}$. In this case we are guaranteed that such a thing exists because the intersection of (an arbitrary family of) $\sigma$-algebras is a $\sigma$-algebra, so you can define the generated $\sigma$-algebra as the intersection of all $\sigma$-algebras containing $\mathcal{A}$.

A slightly more abstract way to say this, which you might find valuable, is that the collection of all $\sigma$-algebras containing $\mathcal{A}$ is a poset ordered by inclusion, and you want the minimum (least) element of this poset (which exists because you can take intersections).
