Conditional expectation is $T$-invariant in a measure preserving system.

This is not true. If $f=I_B$ with $B \in \mathcal B$, the LHS is $I_B$ and the RHS is $I_{T^{-1}B}$. The stated result is true if $f$ is $T$-invariant, i.e. $f\circ T=f$ a.e.


Kavi Rama Murthy's answer demonstrates that the formula is not correct in general. On the other hand, categorical principles point to the correct version of it. In a slightly more general setting, the correct formula is:

Lemma: Let $T:(X,\mathcal{B}(X))\to (Y,\mathcal{B}(Y))$ be a measurable map between measurable spaces. Then for any probability measure $\mu$ on $X$, for any sub-$\sigma$-algebra $\mathcal{A}$ of $\mathcal{B}(Y)$, and for any measurable $f:Y\to \mathbb{R}$ that is integrable with respect to $\overrightarrow{T}(\mu)$, we have:

$$\overleftarrow{T}\left(\mathbb{E}_{\overrightarrow{T}(\mu)}(f\,|\, \mathcal{A})\right) = \mathbb{E}_{\mu}\left(\overleftarrow{T}(f)\,\left|\, \overleftarrow{T}(\mathcal{A})\right.\right).$$

Here $\mathbb{E}_\nu(g\,|\, \mathcal{B})$ is the conditional expectation of the measurable function $g$ given the $\sigma$-algebra $\mathcal{B}$ with respect to the probability measure $\nu$. The first $\overleftarrow{T}$ is the pullback that transforms $\mathcal{A}$-measurable functions into $\overleftarrow{T}(\mathcal{A})$-measurable functions, the second $\overleftarrow{T}$ is the pullback that transforms $\mathcal{B}(Y)$-measurable functions into $\mathcal{B}(X)$-measurable functions, the third $\overleftarrow{T}$ is the pullback of $\sigma$-algebras, and $\overrightarrow{T}$ is the pushforward acting on measures. In short, all decorated $T$'s are maps functorially induced by $T$ ($\overleftarrow{T}$ is typically denoted by $T^{-1}$ or $T^\ast$, and $\overrightarrow{T}$ is typically denoted by $T_\ast$). (I'll leave it as an exercise that all this is indeed syntactic.) As a diagram it is arguably easier to convey what is going on:

[Diagram of the lemma; image not available]


The proof is essentially a version of the algebraic manipulations the OP displays. The important point is that by the characterizing property of conditional expectations what needs to be shown is that

$$\forall A\in\mathcal{A}: \int_{\overleftarrow{T}(A)} \overleftarrow{T}\left(\mathbb{E}_{\overrightarrow{T}(\mu)}(f\,|\, \mathcal{A})\right) \, d\mu = \int_{\overleftarrow{T}(A)} \overleftarrow{T}(f)\, d\mu.$$
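For completeness, here is a sketch of how that identity can be verified; the shorthand $g$ is introduced only for this computation and is not part of the original notation. Write $g = \mathbb{E}_{\overrightarrow{T}(\mu)}(f\,|\, \mathcal{A})$. For $A\in\mathcal{A}$, the change of variables formula for the pushforward (used in the first and third equalities) and the defining property of $g$ (used in the second) give

$$\int_{\overleftarrow{T}(A)} \overleftarrow{T}(g)\, d\mu = \int_{A} g \, d\overrightarrow{T}(\mu) = \int_{A} f \, d\overrightarrow{T}(\mu) = \int_{\overleftarrow{T}(A)} \overleftarrow{T}(f)\, d\mu.$$

Since $\overleftarrow{T}(g)$ is $\overleftarrow{T}(\mathcal{A})$-measurable and every set in $\overleftarrow{T}(\mathcal{A})$ is of the form $\overleftarrow{T}(A)$ with $A\in\mathcal{A}$, this should be enough to identify $\overleftarrow{T}(g)$ with the conditional expectation on the right-hand side of the lemma (up to $\mu$-null sets).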

It is straightforward to adapt the formula to the special case of $T:X\to X$ measure preserving. Concisely (using the OP's notation), it says:

$$E(f\,|\, \mathcal{A})\circ T= E(f\circ T\,|\, T^{-1}(\mathcal{A})).$$
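As a sanity check, the formula can be tested numerically on a toy example. The sketch below is only an illustration under assumptions of my own choosing: $X=\{0,\dots,5\}$ with the uniform measure, $T(x)=x+1 \bmod 6$ (which preserves that measure), $\mathcal{A}$ generated by the partition $\{\{0,1,2\},\{3,4,5\}\}$, and $f(x)=x^2$. The helper names `cond_exp` and `pullback_partition` are hypothetical, not from any library.

```python
from fractions import Fraction


def cond_exp(f, partition, mu):
    """Conditional expectation of f given the sigma-algebra generated by a
    finite partition: on each block, the mu-weighted average of f.
    Assumes every block has positive mass."""
    out = {}
    for block in partition:
        mass = sum(mu[x] for x in block)
        avg = sum(f[x] * mu[x] for x in block) / mass
        for x in block:
            out[x] = avg
    return out


def pullback_partition(partition, T, X):
    """Partition generating T^{-1}(A): the preimage of each block under T."""
    return [frozenset(x for x in X if T[x] in block) for block in partition]


# Toy system (all choices here are illustrative assumptions).
X = range(6)
mu = {x: Fraction(1, 6) for x in X}          # uniform probability measure
T = {x: (x + 1) % 6 for x in X}              # rotation, preserves mu
f = {x: Fraction(x * x) for x in X}          # an arbitrary test function
A = [frozenset({0, 1, 2}), frozenset({3, 4, 5})]

f_T = {x: f[T[x]] for x in X}                # f o T
TA = pullback_partition(A, T, X)             # generates T^{-1}(A)

lhs = {x: cond_exp(f, A, mu)[T[x]] for x in X}   # E(f | A) o T
rhs = cond_exp(f_T, TA, mu)                      # E(f o T | T^{-1}(A))
naive = cond_exp(f_T, A, mu)                     # E(f o T | A), no pullback

print(lhs == rhs)    # True
print(lhs == naive)  # False
```

In this example the two sides of the corrected formula agree, while conditioning $f\circ T$ on $\mathcal{A}$ itself (without pulling the $\sigma$-algebra back) gives a different function, consistent with the counterexample in the first answer.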