Independence Lemma, is it non-trivial?

I'm reading Steven E. Shreve's "Stochastic Calculus for Finance II: Continuous-Time Models", and I'm a bit confused about the Independence Lemma (Lemma 2.3.4). The lemma says:

Lemma 2.3.4 (Independence). Let $(\Omega,\mathscr{F},\mathbb{P})$ be a probability space and let $\mathscr{G}$ be a sub-$\sigma$-algebra of $\mathscr{F}$. Suppose the random variables $X_1,\dots,X_K$ are $\mathscr{G}$-measurable and the random variables $Y_1,\dots,Y_L$ are independent of $\mathscr{G}$. Let $f(x_1,\dots,x_K,y_1,\dots,y_L)$ be a function of the dummy variables $x_1,\dots,x_K$ and $y_1,\dots,y_L$ and define $$g(x_1,\dots,x_K) = \mathbb{E}f(x_1,\dots,x_K,Y_1,\dots,Y_L). \tag{2.3.27}$$ Then $$\mathbb{E}[f(X_1,\dots,X_K,Y_1,\dots,Y_L)\mid \mathscr{G}]=g(X_1,\dots,X_K). \tag{2.3.28}$$

Then the book further explains that

... As with Lemma 2.5.3 of Volume I, the idea here is that since the information in $\mathscr{G}$ is sufficient to determine the values of $X_1,\dots,X_K$, we should hold these random variables constant when estimating $f(X_1,\dots,X_K,Y_1,\dots,Y_L)$. The other random variables, $Y_1,\dots,Y_L$, are independent of $\mathscr{G}$, and so we should integrate them out without regard to the information in $\mathscr{G}$. These two steps, holding $X_1,\dots,X_K$ constant and integrating out $Y_1,\dots,Y_L$, are accomplished by (2.3.27). We get an estimate that depends on the values of $X_1,\dots,X_K$ and, to capture this fact, we replace the dummy (nonrandom) variables $x_1,\dots,x_K$ by the random variables $X_1,\dots,X_K$ at the last step. Although Lemma 2.5.3 of Volume I has a relatively simple proof, the proof of Lemma 2.3.4 requires some measure-theoretic ideas beyond the scope of this text, and will not be given.
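
To make the two-step recipe concrete (a toy instance of my own, not from the book): take $K=L=1$ and $f(x,y)=x+y$. Holding $x$ fixed and integrating out $Y$ gives $g(x)=x+\mathbb{E}Y$ by (2.3.27), and (2.3.28) then reads $$\mathbb{E}[X+Y\mid\mathscr{G}]=X+\mathbb{E}Y,$$ i.e. the $\mathscr{G}$-measurable part is pulled out and the independent part is averaged away.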

OK... I'm confused here. Is this "Independence Lemma" really non-trivial?

In my mind I just think: since (2.3.27) says $$g(x_1,\dots,x_K) = \mathbb{E}f(x_1,\dots,x_K,Y_1,\dots,Y_L),$$ we have $$g(X_1,\dots,X_K) = \mathbb{E}f(X_1,\dots,X_K,Y_1,\dots,Y_L),$$ and hence we get (2.3.28): $$\mathbb{E}[f(X_1,\dots,X_K,Y_1,\dots,Y_L)\mid \mathscr{G}]=g(X_1,\dots,X_K).$$

I don't understand why we need a lemma here to state something so "straightforward" and instinctively right.

I guess I must be neglecting something. There must be something non-trivial that I have taken for granted. What is it?


You have to be careful about which variable you are integrating with respect to. By definition,

$$g(x_1,\ldots,x_K) = \mathbb{E}f(x_1,\ldots,x_K,Y_1,\ldots,Y_L) = \int_\Omega f(x_1,\ldots,x_K,Y_1(\omega_Y),\ldots,Y_L(\omega_Y)) \, d\mathbb{P}(\omega_Y).$$

Hence,

$$g(X_1,\ldots,X_K)(\omega_\mathscr{G}) = \int_\Omega f(X_1(\omega_\mathscr{G}),\ldots,X_K(\omega_\mathscr{G}),Y_1(\omega_Y),\ldots,Y_L(\omega_Y)) \, d\mathbb{P}(\omega_Y).$$

This means that we integrate with respect to the variable $\omega_Y$ while $\omega_\mathscr{G}$ is still held fixed. In contrast,

$$\mathbb{E}[f(X_1,\ldots,X_K,Y_1,\ldots,Y_L)] = \int_\Omega f(X_1(\omega_\mathscr{G}),\ldots,X_K(\omega_\mathscr{G}),Y_1(\omega_\mathscr{G}),\ldots,Y_L(\omega_\mathscr{G})) \, d\mathbb{P}(\omega_\mathscr{G})$$ feeds one and the same $\omega_\mathscr{G}$ into the $X_i$ and the $Y_j$, and is a priori a different object from $$\mathbb{E}[g(X_1,\ldots,X_K)] = \int_\Omega \int_\Omega f(X_1(\omega_\mathscr{G}),\ldots,X_K(\omega_\mathscr{G}),Y_1(\omega_Y),\ldots,Y_L(\omega_Y)) \, d\mathbb{P}(\omega_Y) \, d\mathbb{P}(\omega_\mathscr{G});$$ without the independence assumption the two integrals can genuinely differ.

Similar considerations hold for the conditional expectations. Therefore, the "Independence Lemma" is not an obvious consequence of the definition of $g$.
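
To see the distinction numerically, here is a minimal Monte Carlo sketch (my own illustration, not part of the original answer; it assumes standard normal variables and $f(x,y)=xy$). If $Y=X$, so that $Y$ is not independent of $\mathscr{G}=\sigma(X)$, then $\mathbb{E}[f(X,Y)]=\mathbb{E}[X^2]=1$, whereas the double integral gives $\mathbb{E}[g(X)]=\mathbb{E}[X]\,\mathbb{E}[Y]=0$:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000
X = rng.normal(size=n)            # X is G-measurable; take G = sigma(X)

def f(x, y):
    return x * y

# g(x) = E f(x, Y) for Y ~ N(0, 1), i.e. g(x) = x * E[Y] = 0.

# Same omega fed into both slots (Y := X, NOT independent of G):
same_omega = f(X, X).mean()       # approximates E f(X, X) = E[X^2] = 1

# Separate integration variable omega_Y (an independent copy of X):
Y = rng.normal(size=n)
both_integrated = f(X, Y).mean()  # approximates E[g(X)] = 0

print(same_omega, both_integrated)   # ~1.0 vs ~0.0
```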


The proof of the Independence Lemma requires Dynkin's $\pi$-$\lambda$ theorem. For simplicity, I consider only the one-dimensional case; it generalizes easily to the multi-dimensional case.

The Independence Lemma: Let $(\Omega,\mathcal{F},P)$ be a probability space and let $\mathcal{G}\subseteq\mathcal{F}$ be a sub-$\sigma$-algebra. Let $X,Y$ be random variables such that $X$ is $\mathcal{G}$-measurable and $Y$ is independent of $\mathcal{G}$. Let $f:\mathbb{R}^{2}\rightarrow\mathbb{R}$ be a Borel function such that $E\left[|f(X,Y)|\right]<\infty$ and $E\left[|f(x,Y)|\right]<\infty$ for each $x\in\mathbb{R}$. Define $g:\mathbb{R}\rightarrow\mathbb{R}$ by $g(x)=E\left[f(x,Y)\right]$. Then $E\left[f(X,Y)\mid\mathcal{G}\right]=g(X)$.

Proof: First, we prove that the theorem is true for all functions $f$ of the form $f=1_{C}$ for some $C\in\mathcal{B}(\mathbb{R}^{2})$. Let $\mathcal{P}=\{A\times B\mid A,B\in\mathcal{B}(\mathbb{R})\}$ and $\mathcal{L}=\{C\in\mathcal{B}(\mathbb{R}^{2})\mid\mbox{the theorem holds for }1_{C}\}$. Clearly $\mathcal{P}$ is a $\pi$-class (in the sense that $C_{1}\cap C_{2}\in\mathcal{P}$ whenever $C_{1},C_{2}\in\mathcal{P}$). We verify that $\mathcal{L}$ is a $\lambda$-class, in the sense that: (i) $\emptyset\in\mathcal{L}$; (ii) $C^{c}\in\mathcal{L}$ whenever $C\in\mathcal{L}$; (iii) $\cup_{n}C_{n}\in\mathcal{L}$ for any sequence $(C_{n})_{n}$ of pairwise disjoint sets in $\mathcal{L}$.

(i): Clearly $\emptyset\in\mathcal{L}$.

(ii): Let $C\in\mathcal{L}$. Let $g_{C}$ and $g_{C^{c}}$ be defined by $g_{C}(x)=E\left[1_{C}(x,Y)\right]$ and $g_{C^{c}}(x)=E\left[1_{C^{c}}(x,Y)\right]$. Observe that \begin{eqnarray*} g_{C^{c}}(x) & = & E\left[1_{C^{c}}(x,Y)\right]\\ & = & E\left[1-1_{C}(x,Y)\right]\\ & = & 1-g_{C}(x). \end{eqnarray*} It follows that \begin{eqnarray*} E\left[1_{C^{c}}(X,Y)\mid\mathcal{G}\right] & = & E\left[1-1_{C}(X,Y)\mid\mathcal{G}\right]\\ & = & 1-g_{C}(X)\\ & = & g_{C^{c}}(X). \end{eqnarray*} This shows that $C^{c}\in\mathcal{L}$, so condition (ii) is satisfied.

(iii): Let $C_{1},C_{2},\ldots\in\mathcal{L}$ be pairwise disjoint and let $C=\cup_{n}C_{n}$. For each $n$, define $g_{n}:\mathbb{R}\rightarrow\mathbb{R}$ by $g_{n}(x)=E\left[1_{C_{n}}(x,Y)\right]$, and define $g:\mathbb{R}\rightarrow\mathbb{R}$ by $g(x)=E\left[1_{C}(x,Y)\right]$. Since $C_{1},C_{2},\ldots$ are pairwise disjoint, we have $1_{C}=\sum_{n=1}^{\infty}1_{C_{n}}$. Therefore, for each $x\in\mathbb{R}$, \begin{eqnarray*} g(x) & = & E\left[1_{C}(x,Y)\right]\\ & = & E\left[\sum_{n=1}^{\infty}1_{C_{n}}(x,Y)\right]\\ & = & \sum_{n=1}^{\infty}E\left[1_{C_{n}}(x,Y)\right]\\ & = & \sum_{n=1}^{\infty}g_{n}(x). \end{eqnarray*} By the Monotone Convergence Theorem (conditional expectation version), we have \begin{eqnarray*} E\left[1_{C}(X,Y)\mid\mathcal{G}\right] & = & E\left[\sum_{n=1}^{\infty}1_{C_{n}}(X,Y)\mid\mathcal{G}\right]\\ & = & \sum_{n=1}^{\infty}E\left[1_{C_{n}}(X,Y)\mid\mathcal{G}\right]\\ & = & \sum_{n=1}^{\infty}g_{n}(X)\\ & = & g(X). \end{eqnarray*} This shows that condition (iii) is satisfied.

Next, we show that $\mathcal{P}\subseteq\mathcal{L}$. Let $C=A\times B$ for some $A,B\in\mathcal{B}(\mathbb{R})$. Define $g:\mathbb{R}\rightarrow\mathbb{R}$ by $g(x)=E\left[1_{C}(x,Y)\right]$. Observe that $1_{C}(x,Y)(\omega)=1_{A}(x)1_{Y^{-1}(B)}(\omega)$, so $g(x)=1_{A}(x)E\left[1_{Y^{-1}(B)}\right]=1_{A}(x)E\left[1_{B}(Y)\right]$. On the other hand, since $1_{A}(X)$ is $\mathcal{G}$-measurable and $Y$ is independent of $\mathcal{G}$, \begin{eqnarray*} E\left[1_{C}(X,Y)\mid\mathcal{G}\right] & = & E\left[1_{A}(X)1_{B}(Y)\mid\mathcal{G}\right]\\ & = & 1_{A}(X)E\left[1_{B}(Y)\mid\mathcal{G}\right]\\ & = & 1_{A}(X)E\left[1_{B}(Y)\right]\\ & = & g(X). \end{eqnarray*} Therefore $C\in\mathcal{L}$.

Now, by the Dynkin $\pi$-$\lambda$ theorem, we have $\sigma(\mathcal{P})\subseteq\mathcal{L}$. However, $\sigma(\mathcal{P})=\mathcal{B}(\mathbb{R}^{2})$ and $\mathcal{L}\subseteq\mathcal{B}(\mathbb{R}^{2})$, so $\mathcal{L}=\mathcal{B}(\mathbb{R}^{2})$.
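
As a quick numerical illustration of this base case (a sketch of my own, not part of the proof; it assumes $X,Y$ independent standard normals, $A=(0,\infty)$, $B=(-1,1)$), the conditional frequency of $\{Y\in B\}$ on the event $\{X\in A\}$ should match $P(Y\in B)$, which is exactly $g(X)=1_{A}(X)E\left[1_{B}(Y)\right]$ evaluated on that event:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 1_000_000
X = rng.normal(size=n)        # take G = sigma(X)
Y = rng.normal(size=n)        # independent of G

in_A = X > 0                  # A = (0, infinity)
in_B = np.abs(Y) < 1          # B = (-1, 1)

# For C = A x B the proof gives g(x) = 1_A(x) * E[1_B(Y)], so on the event
# {X in A} the lemma predicts E[1_C(X, Y) | G] = P(Y in B).
print(in_B[in_A].mean())      # frequency of {Y in B} among samples with X in A
print(in_B.mean())            # Monte Carlo estimate of P(Y in B); should agree
```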

Next, let $\mathcal{V}$ be the set of all functions $f$ such that the theorem holds for $f$. We verify that $\mathcal{V}$ is a vector space. Let $\alpha\in\mathbb{R}$ and $f_{1},f_{2}\in\mathcal{V}$. Let $g_{1},g_{2}:\mathbb{R}\rightarrow\mathbb{R}$ be defined by $g_{i}(x)=E\left[f_{i}(x,Y)\right]$, for $i=1,2$. Define $f=\alpha f_{1}+f_{2}$ and $g:\mathbb{R}\rightarrow\mathbb{R}$ by $g(x)=E\left[f(x,Y)\right]$. Note that \begin{eqnarray*} g(x) & = & E\left[\alpha f_{1}(x,Y)+f_{2}(x,Y)\right]\\ & = & \alpha E\left[f_{1}(x,Y)\right]+E\left[f_{2}(x,Y)\right]\\ & = & \alpha g_{1}(x)+g_{2}(x). \end{eqnarray*} Now \begin{eqnarray*} E\left[\left(\alpha f_{1}+f_{2}\right)(X,Y)\mid\mathcal{G}\right] & = & \alpha E\left[f_{1}(X,Y)\mid\mathcal{G}\right]+E\left[f_{2}(X,Y)\mid\mathcal{G}\right]\\ & = & \alpha g_{1}(X)+g_{2}(X)\\ & = & g(X). \end{eqnarray*} This shows that $\alpha f_{1}+f_{2}\in\mathcal{V}$ and hence $\mathcal{V}$ is a vector space. In particular, since $\mathcal{V}$ contains $1_{C}$ for every $C\in\mathcal{B}(\mathbb{R}^{2})$ by the first step, $\mathcal{V}$ contains all simple functions.

Let $f:\mathbb{R}^{2}\rightarrow[0,\infty]$ be a non-negative Borel function. Define $g:\mathbb{R}\rightarrow[0,\infty]$ by $g(x)=E\left[f(x,Y)\right]$. Choose a sequence of simple functions $(f_{n})_{n}$ defined on $\mathbb{R}^{2}$ such that $0\leq f_{1}\leq f_{2}\leq\ldots\leq f$ and $f_{n}\rightarrow f$ pointwise. For each $n$, let $g_{n}:\mathbb{R}\rightarrow\mathbb{R}$ be defined by $g_{n}(x)=E\left[f_{n}(x,Y)\right]$. For each $x\in\mathbb{R}$, by the Monotone Convergence Theorem, we have \begin{eqnarray*} g(x) & = & E\left[f(x,Y)\right]\\ & = & \lim_{n\rightarrow\infty}E\left[f_{n}(x,Y)\right]\\ & = & \lim_{n\rightarrow\infty}g_{n}(x). \end{eqnarray*} By the Monotone Convergence Theorem (conditional expectation version) again, we have \begin{eqnarray*} E\left[f(X,Y)\mid\mathcal{G}\right] & = & \lim_{n\rightarrow\infty}E\left[f_{n}(X,Y)\mid\mathcal{G}\right]\\ & = & \lim_{n\rightarrow\infty}g_{n}(X)\\ & = & g(X). \end{eqnarray*}

Finally, let $f:\mathbb{R}^{2}\rightarrow\mathbb{R}$ be a Borel function such that $E\left[|f(X,Y)|\right]<\infty$ and, for each $x\in\mathbb{R}$, $E\left[|f(x,Y)|\right]<\infty$. Define $g:\mathbb{R}\rightarrow\mathbb{R}$ by $g(x)=E\left[f(x,Y)\right]$. Write $f=f^{+}-f^{-}$, where $f^{+}=\max(f,0)$ and $f^{-}=\max(-f,0)$. Define $g^{+}:\mathbb{R}\rightarrow[0,\infty]$ and $g^{-}:\mathbb{R}\rightarrow[0,\infty]$ by $g^{+}(x)=E\left[f^{+}(x,Y)\right]$ and $g^{-}(x)=E\left[f^{-}(x,Y)\right]$. Observe that $E\left[|f^{+}(x,Y)|\right]\leq E\left[|f(x,Y)|\right]<\infty$ and similarly $E\left[|f^{-}(x,Y)|\right]<\infty$, so $g^{+}$ and $g^{-}$ are actually real-valued. Moreover, \begin{eqnarray*} g(x) & = & E\left[f^{+}(x,Y)\right]-E\left[f^{-}(x,Y)\right]\\ & = & g^{+}(x)-g^{-}(x). \end{eqnarray*} Therefore, \begin{eqnarray*} E\left[f(X,Y)\mid\mathcal{G}\right] & = & E\left[f^{+}(X,Y)\mid\mathcal{G}\right]-E\left[f^{-}(X,Y)\mid\mathcal{G}\right]\\ & = & g^{+}(X)-g^{-}(X)\\ & = & g(X). \end{eqnarray*}
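
For readers who want a sanity check of the lemma itself, here is a small Monte Carlo experiment (my own sketch, not part of the proof; it assumes $X,Y$ independent standard normals and $f(x,y)=\sin x+xy^{2}$, so that $g(x)=\sin x+x$ since $E[Y^{2}]=1$). It estimates $E\left[f(X,Y)\mid X\approx x\right]$ by averaging over narrow bins in $X$ and compares the result with $g$:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2_000_000
X = rng.normal(size=n)                     # G-measurable (take G = sigma(X))
Y = rng.normal(size=n)                     # independent of G

def f(x, y):
    return np.sin(x) + x * y**2

def g(x):                                  # g(x) = E f(x, Y) = sin(x) + x * E[Y^2]
    return np.sin(x) + x                   # E[Y^2] = 1 for Y ~ N(0, 1)

# Estimate E[f(X, Y) | X ~ x] by averaging over narrow bins in X and
# compare with g evaluated at the bin midpoint.
edges = np.linspace(-2.0, 2.0, 9)
which = np.digitize(X, edges)
for k in range(1, len(edges)):
    mask = which == k
    x_mid = 0.5 * (edges[k - 1] + edges[k])
    est = f(X[mask], Y[mask]).mean()
    print(f"x ~ {x_mid:+.2f}   E[f | X] ~ {est:+.4f}   g(x_mid) = {g(x_mid):+.4f}")
```

Up to sampling noise and the bin-discretization error, the two columns agree, as the lemma predicts.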