Conditional Expectation of Functions of Random Variables Satisfying Certain Properties

Suppose that we have a probability space $(\Omega, \mathcal{F}, P)$. Let $X,Y$ be real-valued random variables defined on this space, and let $\mathcal{H} \subset \mathcal{F}$ be a sub-sigma-algebra.

Suppose $X$ is $\mathcal{H}$-measurable (i.e., $X^{-1}(B) \in \mathcal{H}$ for all Borel $B \subset \mathbb{R}$).

Also suppose that $Y$ is independent of $\mathcal{H}$ (which implies that $X$ and $Y$ are independent).

Is it then true that for every Borel-measurable function $g: \mathbb{R}^2 \to \mathbb{R}$ such that $g(X,Y)$ is integrable, we have $\mathbb{E}(g(X,Y)|\mathcal{H})=\mathbb{E}(g(X,Y)|X)$ almost surely?

Observations: The claim holds for functions of the form $g(x,y)=f(x)h(y)$ (with $f,h$ bounded, say), because $\mathbb{E}(f(X)h(Y)|\mathcal{H}) = \mathbb{E}(h(Y)) \cdot f(X)=\mathbb{E}(f(X)h(Y)|X)$: the factor $f(X)$ is measurable with respect to both $\sigma(X)$ and $\mathcal{H}$, and $Y$ is independent of both. But I can't seem to prove the general case. Maybe we can approximate an arbitrary $g$ from below by functions of this form and then use the MCT?
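Spelled out step by step, the product case might look as follows. This is only a sketch, assuming $f$ and $h$ are bounded Borel functions so that every expectation below is finite; all equalities hold almost surely.

$$\begin{aligned}
\mathbb{E}(f(X)h(Y)\mid\mathcal{H}) &= f(X)\,\mathbb{E}(h(Y)\mid\mathcal{H}) && \text{($f(X)$ is $\mathcal{H}$-measurable)}\\
&= f(X)\,\mathbb{E}(h(Y)) && \text{($Y$ is independent of $\mathcal{H}$)},\\
\mathbb{E}(f(X)h(Y)\mid X) &= f(X)\,\mathbb{E}(h(Y)\mid\sigma(X)) && \text{($f(X)$ is $\sigma(X)$-measurable)}\\
&= f(X)\,\mathbb{E}(h(Y)) && \text{($Y$ is independent of $\sigma(X)\subset\mathcal{H}$)}.
\end{aligned}$$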

Is it possible to show that any measurable $g: \mathbb{R}^2 \to \mathbb{R}$ can be written as the (upward) limit of linear combinations of functions of the form $\chi_{A \times B}$, for Borel $A,B \subset \mathbb{R}$? If so, we could apply the MCT together with the preceding observation and be done. (Here $\chi_E$ denotes the characteristic function of $E$.)


Solution 1:

Hints

  1. Show that the claim holds for $g(x,y) = 1_B(x) 1_C(y)$ where $B,C \in \mathcal{B}(\mathbb{R})$ are Borel sets.
  2. Show that $$\mathcal{D} := \{D \in \mathcal{B}(\mathbb{R}^2); \text{claim holds for} \, g(x,y) = 1_D(x,y)\}$$ is a Dynkin system. Conclude that $\mathcal{D} = \mathcal{B}(\mathbb{R}^2)$.
  3. By linearity, the claim then holds for simple functions (aka elementary functions). Use Beppo Levi (monotone convergence for conditional expectations) to extend it to non-negative measurable functions $g$.
  4. Conclude.
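For step 2, the Dynkin-system check could be sketched as follows (a sketch only; all equalities hold almost surely). Clearly $\mathbb{R}^2 \in \mathcal{D}$, since both conditional expectations equal $1$. For $D \in \mathcal{D}$, and for pairwise disjoint $D_1, D_2, \ldots \in \mathcal{D}$, linearity and conditional monotone convergence give

$$\begin{aligned}
\mathbb{E}(1_{D^c}(X,Y)\mid\mathcal{H}) &= 1 - \mathbb{E}(1_D(X,Y)\mid\mathcal{H}) = 1 - \mathbb{E}(1_D(X,Y)\mid X) = \mathbb{E}(1_{D^c}(X,Y)\mid X),\\
\mathbb{E}\big(1_{\bigcup_n D_n}(X,Y)\mid\mathcal{H}\big) &= \sum_n \mathbb{E}(1_{D_n}(X,Y)\mid\mathcal{H}) = \sum_n \mathbb{E}(1_{D_n}(X,Y)\mid X) = \mathbb{E}\big(1_{\bigcup_n D_n}(X,Y)\mid X\big).
\end{aligned}$$

By step 1, $\mathcal{D}$ contains the rectangles $B \times C$, which form a $\pi$-system generating $\mathcal{B}(\mathbb{R}^2)$, so Dynkin's $\pi$-$\lambda$ theorem yields $\mathcal{D} = \mathcal{B}(\mathbb{R}^2)$.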

Solution 2:

I think I figured it out. Can someone please confirm that the proof is correct? Is this a special case of a more general result?

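It suffices to prove the claim when $g=\chi_E$, for Borel sets $E \subset \mathbb{R}^2$. The reason is that any non-negative measurable $g$ can be written as the monotone limit of linear combinations of such functions, so the (conditional) MCT applies; a general $g$ with $g(X,Y)$ integrable is then handled by splitting it into its positive and negative parts.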
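One standard choice for the approximating sequence is the dyadic one. As a sketch, for a non-negative Borel $g$ (the name $g_n$ is introduced here for illustration):

$$g_n := \sum_{k=0}^{n2^n-1} \frac{k}{2^n}\,\chi_{\{k2^{-n} \le g < (k+1)2^{-n}\}} + n\,\chi_{\{g \ge n\}}, \qquad 0 \le g_n \uparrow g \text{ pointwise.}$$

Each set appearing here is a Borel subset of $\mathbb{R}^2$, so $g_n$ is a finite linear combination of functions $\chi_E$. If the claim holds for every $\chi_E$, then linearity and conditional monotone convergence give, almost surely,

$$\mathbb{E}(g(X,Y)\mid\mathcal{H}) = \lim_{n\to\infty} \mathbb{E}(g_n(X,Y)\mid\mathcal{H}) = \lim_{n\to\infty} \mathbb{E}(g_n(X,Y)\mid X) = \mathbb{E}(g(X,Y)\mid X).$$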

First consider the case $E=A \times B$ for Borel sets $A,B \subset \mathbb{R}$. If $g=\chi_{A \times B}$, then $g(X,Y) = \chi_{\{X \in A\}} \cdot \chi_{\{Y \in B\}}$. Since $\chi_{\{X \in A\}}$ is measurable with respect to both $\sigma(X)$ and $\mathcal{H}$, and $Y$ is independent of both, it follows that $\mathbb{E}(g(X,Y)|\mathcal{H})=P(Y \in B) \cdot \chi_{\{X \in A\}} = \mathbb{E}(g(X,Y)|X)$.

Let $\mathcal{I}= \{ A \times B : A,B \subset \mathbb{R} \text{ are Borel} \}$. Then $\mathcal{I}$ is a $\pi$-system that generates the Borel sets in $\mathbb{R}^2$. Therefore, by Dynkin's $\pi$-$\lambda$ theorem, the Dynkin system generated by $\mathcal{I}$ is the same as the sigma-algebra generated by $\mathcal{I}$, i.e., the Borel sets in $\mathbb{R}^2$.

Now one checks, using linearity and conditional monotone convergence, that $\{ F \subset \mathbb{R}^2 : F \text{ is Borel and } \mathbb{E}(\chi_F (X,Y) | \mathcal{H}) = \mathbb{E} (\chi_F(X,Y)|X) \}$ is itself a Dynkin system containing $\mathcal{I}$. Hence it contains all Borel sets in $\mathbb{R}^2$, which gives the result for $g=\chi_F$, and we then extend to general $g$ as stated above.
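On the question of a more general result: the same $\pi$-$\lambda$ argument appears to give a slightly stronger, explicit statement, often referred to as the freezing (or independence) lemma. A sketch, assuming $g$ is bounded or non-negative, and writing $\varphi$ for a helper function that is not part of the original post:

$$\varphi(x) := \mathbb{E}\big(g(x,Y)\big), \qquad \mathbb{E}(g(X,Y)\mid\mathcal{H}) = \varphi(X) \quad \text{a.s.}$$

Here $\varphi$ is Borel-measurable by Fubini's theorem, so $\varphi(X)$ is $\sigma(X)$-measurable. Since $\sigma(X) \subset \mathcal{H}$, the tower property then yields $\mathbb{E}(g(X,Y)\mid X) = \mathbb{E}(\varphi(X)\mid X) = \varphi(X)$ as well. For $g=\chi_{A \times B}$ one gets $\varphi(x) = \chi_A(x)\,P(Y \in B)$, which agrees with the rectangle case above.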