Why isn't the product $\sigma$-algebra defined as the pre-image $\sigma$-algebra of the canonical projections?

If instead of an arbitrary $f$, you use the two projections $\pi_1,\pi_2$ to generate a $\sigma$-algebra, then that will coincide with what is usually called the product $\sigma$-algebra (the one generated by "rectangles"). Using an arbitrary $f$ is kind of irrelevant because you're not exploiting the product structure.

More generally, let $\{(X_i,\mathscr{A}_i)\}_{i\in I}$ be measurable spaces, where $I$ is any non-empty index set. Let $X:= \prod\limits_{i\in I}X_i$ be the cartesian product of the sets (i.e., the collection of all functions $f:I\to \bigcup_{i\in I}X_i$ such that $f(i)\in X_i$ for each $i\in I$; such an element is typically written as $(x_i)_{i\in I}$ rather than in the function notation $f$). On the product space $X$, we can define two obvious $\sigma$-algebras: the product and box $\sigma$-algebras (but see my remarks below about terminology).


1. Product $\sigma$-algebra.

This is the smallest $\sigma$-algebra, $\mathscr{A}_{\text{prod}}$, on the product space $X$ such that each projection $\pi_i:X\to X_i$ (defined as $\pi_i(f):= f(i)$) is $\mathscr{A}_{\text{prod}}$-$\mathscr{A}_i$ measurable. This means: \begin{align} \mathscr{A}_{\text{prod}}&=\sigma\left(\bigcup_{i\in I}\pi_i^{-1}(\mathscr{A}_i)\right) \end{align} where $\sigma (\cdots)$ means the $\sigma$-algebra generated by the collection of subsets specified inside (the intersection of all $\sigma$-algebras containing the set inside). We're just taking all those sets required to ensure each $\pi_i$ becomes measurable, and making a $\sigma$-algebra out of those. Note that it is common to denote $\mathscr{A}_{\text{prod}}$ as $\bigotimes_{i\in I}\mathscr{A}_i$.

A nice fact about the product $\sigma$-algebra is that because it plays nicely with the projections $\pi_i$, we have the following theorem (as always when dealing with "products"):

Theorem.

Let $\{(X_i,\mathscr{A}_i)\}_{i\in I}$, where $I$ is a non-empty index set, be a collection of measurable spaces, let $X$ be the cartesian product of $X_i$, and let $\mathscr{A}_{\text{prod}}$ be the product $\sigma$-algebra as above. For any measurable space $(Y,\mathscr{B})$, and any function $f:Y\to X$, we have that $f$ is $\mathscr{B}$-$\mathscr{A}_{\text{prod}}$ measurable if and only if for each $i\in I$ the component function $\pi_i\circ f$ is $\mathscr{B}$-$\mathscr{A}_i$ measurable.

The implication $\implies$ is trivial since composition of measurable functions (with correct $\sigma$-algebras) is again measurable. For the converse, you need to know that to prove $f$ is measurable, you just have to show that for some generating set of $\mathscr{A}_{\text{prod}}$, such as $\bigcup_{i\in I}\pi_i^{-1}(\mathscr{A}_i)$, all the preimages under $f$ lie in $\mathscr{B}$. But this is exactly what measurability of each $\pi_i\circ f$ means. This completes the proof.
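
To spell out the last step of the converse (a short worked computation using only the notation above): for any $i\in I$ and $A_i\in \mathscr{A}_i$, \begin{align} f^{-1}\left(\pi_i^{-1}(A_i)\right)=(\pi_i\circ f)^{-1}(A_i)\in \mathscr{B}, \end{align} so every generator of $\mathscr{A}_{\text{prod}}$ has its preimage in $\mathscr{B}$. The general principle being used is that $\{E\subset X\,:\, f^{-1}(E)\in \mathscr{B}\}$ is itself a $\sigma$-algebra on $X$, so once it contains a generating set it contains all of $\mathscr{A}_{\text{prod}}$.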


2. Box $\sigma$-algebra.

We define the box $\sigma$-algebra $\mathscr{A}_{\text{box}}$ to be the $\sigma$-algebra generated by all "measurable boxes/rectangles", i.e., \begin{align} \mathscr{A}_{\text{box}}:=\sigma\left(\left\{\prod_{i\in I}A_i\,\bigg| \, \text{for all $i\in I$, }A_i\in \mathscr{A}_i\right\}\right) \end{align}


3. Relationship between Product and Box $\sigma$-algebras.

The main relationship between them is summarized in the following theorem:

Theorem.

With notation as above, we always have that $\mathscr{A}_{\text{prod}}\subset \mathscr{A}_{\text{box}}$. If we make the additional assumption that the index set $I$ is countable, then we have the opposite inclusion as well, and thus $\mathscr{A}_{\text{prod}}=\mathscr{A}_{\text{box}}$.

Finally, if every $X_i$ is non-empty and there are uncountably many $i\in I$ for which $\mathscr{A}_i\neq \{\emptyset,X_i\}$ (i.e., $\mathscr{A}_i$ is not the trivial $\sigma$-algebra), then we have a strict inclusion $\mathscr{A}_{\text{prod}}\subsetneq \mathscr{A}_{\text{box}}$.

The first part of the proof is easy: for any $i\in I$ and measurable set $A_i\in \mathscr{A}_i$, the set $\pi_i^{-1}(A_i)$ is a product of sets $\prod_{j\in I}A_j$, where $A_j=X_j$ for $j\neq i$ (i.e., we're only placing a restriction on the $i$-th coordinate). This shows that the generating set of the product $\sigma$-algebra is contained in the generating set of the box $\sigma$-algebra, hence we also have the inclusion of the $\sigma$-algebras themselves.

Conversely, if $I$ is a countable index set, and we consider a measurable rectangle $\prod_{i\in I}A_i$, then this can be written as $\bigcap_{i\in I}\pi_i^{-1}(A_i)$. This is a countable intersection of sets in $\mathscr{A}_{\text{prod}}$, and thus still belongs to $\mathscr{A}_{\text{prod}}$. Since the generating set of $\mathscr{A}_{\text{box}}$ is contained in $\mathscr{A}_{\text{prod}}$, we have $\mathscr{A}_{\text{box}}\subset \mathscr{A}_{\text{prod}}$, hence completing the proof.

For the final statement, we may without loss of generality assume that for each $i\in I$, $\mathscr{A}_i\neq \{\emptyset, X_i\}$. For each $i\in I$, pick a measurable set $A_i\in \mathscr{A}_i\setminus \{\emptyset,X_i\}$, and consider $A=\prod_{i\in I}A_i$. This clearly lies in the box $\sigma$-algebra, but we claim it is not in the product $\sigma$-algebra. To see this, note that \begin{align} \mathscr{A}_{\text{prod}}&:=\sigma\left(\bigcup_{i\in I}\pi_i^{-1}(\mathscr{A}_i)\right)=\bigcup_{\substack{C\subset I\\ \text{$C$ countable}}}\sigma\left(\bigcup_{i\in C}\pi_i^{-1}(\mathscr{A}_i)\right), \end{align} where "countable" means finite or countably infinite. The second equality is a general measure theory exercise: the inclusion $\supset$ is clear, and you can directly verify that the set on the right is a $\sigma$-algebra (the key point being that a countable union of countable index sets is again a countable index set), and that the RHS contains the generating set $\bigcup_{i\in I}\pi_i^{-1}(\mathscr{A}_i)$ of the LHS. This proves the second equality.
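
As a sketch of the closure step in that exercise (the names $E_n$ and $C_n$ are introduced here only for illustration): suppose $E_n\in \sigma\left(\bigcup_{i\in C_n}\pi_i^{-1}(\mathscr{A}_i)\right)$ for $n\in \mathbb{N}$, with each $C_n\subset I$ countable, and put $C:=\bigcup_{n}C_n$, which is again countable. Then \begin{align} E_n\in \sigma\left(\bigcup_{i\in C_n}\pi_i^{-1}(\mathscr{A}_i)\right)\subset \sigma\left(\bigcup_{i\in C}\pi_i^{-1}(\mathscr{A}_i)\right)\text{ for every $n$,}\quad \text{hence}\quad \bigcup_{n}E_n\in \sigma\left(\bigcup_{i\in C}\pi_i^{-1}(\mathscr{A}_i)\right). \end{align} Complements are even easier, since a set and its complement lie in the same $\sigma\left(\bigcup_{i\in C}\pi_i^{-1}(\mathscr{A}_i)\right)$.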

So, suppose for the sake of contradiction that $A$ belongs to the product $\sigma$-algebra. Then, by the above equality, there is some countable $C\subset I$ such that $A\in \sigma\left(\bigcup\limits_{i\in C}\pi_i^{-1}(\mathscr{A}_i)\right)$. This implies that $A$ "depends only on the coordinates in $C$". To make this statement more precise, let $\pi_C:\prod_{i\in I}X_i\to \prod_{i\in C}X_i$, $\pi_C(f):=f|_C$ be the canonical projection defined by restricting the function to the smaller domain $C$. Also, for each $j\in C$, let $p_j:\prod_{i\in C}X_i\to X_j$ be the usual projection $p_j(g):=g(j)$. The relationship between all the projections is that for $i\in C$, we have $\pi_i=p_i\circ \pi_C$ (just unwind the definitions on the RHS). So, we have \begin{align} A\in \sigma\left(\bigcup_{i\in C}\pi_i^{-1}(\mathscr{A}_i)\right)&= \sigma\left(\bigcup_{i\in C}\pi_C^{-1}\left(p_i^{-1}(\mathscr{A}_i)\right)\right)\\ &=\sigma\left(\pi_C^{-1}\left(\bigcup_{i\in C}p_i^{-1}(\mathscr{A}_i)\right)\right)\\ &\subset \pi_C^{-1}\left(\text{power set of $\prod_{i\in C}X_i$}\right) \end{align} where the last inclusion holds because the collection of all preimages under $\pi_C$ is itself a $\sigma$-algebra containing the generators. This shows us that there is some subset $G\subset \prod_{i\in C}X_i$ such that $A=\pi_C^{-1}(G)$.

Now fix an index $j\in I\setminus C$ (such an index exists since $I$ is uncountable while $C$ is countable). Since every $A_i$ is non-empty, so is $A$, and hence $\pi_j(A)=A_j$. Therefore \begin{align} A_j=\pi_j(A) = \pi_j(\pi_C^{-1}(G))\in \{\emptyset,X_j\} \end{align} which contradicts our assumption about $A_j$.

The membership in $\{\emptyset,X_j\}$ holds because (assuming things are non-empty) an element of $\pi_C^{-1}(G)$ is by definition a function $f:I\to \bigcup_{i\in I}X_i$ whose restriction $f|_C$ belongs to $G$. Since $j\notin C$, if we modify only the value $f(j)$, we get a new function $\tilde{f}$ which still belongs to $\pi_C^{-1}(G)$, because $\tilde{f}|_C=f|_{C}\in G$. Since we can change the value $f(j)$ arbitrarily, we can obtain any element of $X_j$. This completes the proof.


Final Remarks.

The way I introduced things here, the analogy with topology should be striking, with the one exception that for topology, the product and box topologies coincide provided the index set is finite (merely countable is not enough anymore). So, note that in particular for the index set $I=\{1,2\}$, so that $X=X_1\times X_2$, the two descriptions of the $\sigma$-algebras are equivalent.
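
For a concrete sanity check of the case $I=\{1,2\}$, here is a minimal Python sketch on small finite sets. Everything in it is an illustrative assumption rather than part of the argument above: the sets `X1`, `X2`, the $\sigma$-algebras `A1`, `A2`, and the helper names `generate_sigma_algebra` and `preimage` are made up for this example. On a finite set, closing under complements and pairwise unions is enough to obtain the generated $\sigma$-algebra.

```python
from itertools import product


def generate_sigma_algebra(universe, generators):
    """Sigma-algebra on a finite `universe` generated by `generators`.

    On a finite set every countable union is a finite one, so it suffices
    to close under complements and pairwise unions until nothing changes.
    """
    universe = frozenset(universe)
    sets = {frozenset(), universe} | {frozenset(g) for g in generators}
    changed = True
    while changed:
        changed = False
        current = list(sets)
        for a in current:
            complement = universe - a
            if complement not in sets:
                sets.add(complement)
                changed = True
            for b in current:
                union = a | b
                if union not in sets:
                    sets.add(union)
                    changed = True
    return sets


def preimage(coord, subset, space):
    """Preimage of `subset` under the projection onto coordinate `coord`."""
    return frozenset(point for point in space if point[coord] in subset)


# Hypothetical example data: two small measurable spaces.
X1 = frozenset({0, 1, 2})
A1 = [frozenset(), frozenset({0}), frozenset({1, 2}), X1]
X2 = frozenset({"a", "b"})
A2 = [frozenset(), frozenset({"a"}), frozenset({"b"}), X2]

X = frozenset(product(X1, X2))

# Generators of the product sigma-algebra: preimages under the projections.
prod_generators = [preimage(0, A, X) for A in A1] + [preimage(1, B, X) for B in A2]
# Generators of the box sigma-algebra: all measurable rectangles A x B.
box_generators = [frozenset(product(A, B)) for A in A1 for B in A2]

A_prod = generate_sigma_algebra(X, prod_generators)
A_box = generate_sigma_algebra(X, box_generators)

print(A_prod == A_box)  # True
```

Both generated families consist of all unions of the four "atoms" $\{0\}\times\{a\}$, $\{0\}\times\{b\}$, $\{1,2\}\times\{a\}$, $\{1,2\}\times\{b\}$. Of course, a finite check like this only illustrates the theorem; it says nothing about the uncountable case.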

I think the terminology I have introduced here is the most appropriate one, but I should mention that (in Probability) I've also seen $\mathscr{A}_{\text{prod}}$ being described as "the cylindrical $\sigma$-algebra", and $\mathscr{A}_{\text{box}}$ being described as "the product $\sigma$-algebra"; this second bit is pretty absurd to me, so I avoid this terminology.