Definition of Conditional Probability by Measure Theory

I was reading a book on information theory and entropy by Robert Gray, when I saw the following definition of conditional probability:

Given a probability space $(\Omega,\mathcal{B}, P)$ and a sub-$\sigma$-field $\mathcal{G}$, for any event $H\in\mathcal{B}$ the conditional probability $m(H\mid\mathcal{G})$ is defined as any function, say $g$, which satisfies the two properties:

(1) $g$ is measurable with respect to $\mathcal{G}$

(2) $\displaystyle\int_{G}gh\,dP=m(G\cap H)$ for all $G\in\mathcal{G}$

I am quite confused by this definition, since it is very different from the definition through the joint probability of events.

I understand what a measurable function, a sub-$\sigma$-field and a probability space are, and I'm guessing that the author is trying to define the measure $m$ through the measurable function $g$, but I don't quite understand what the second requirement is saying. In particular, what does the $h$ in $\displaystyle\int_{G}ghdP$ refer to? It just jumped out of nowhere in the book, so I suspect it may have some conventional meaning.

I'd appreciate it a lot if someone can help. Thank you!!


Solution 1:

The starting point for abstract measure-theoretic conditional probability is conditional expectation. Essentially, one uses the identity $P(A)=\mathbb{E}(1_A)$.

Now let $(\Omega,\mathcal{B},P)$ be a probability space, $f$ an integrable random variable and $\mathcal{G}$ a sub-$\sigma$-algebra of $\mathcal{B}$. A conditional expectation of $f$ with respect to $\mathcal{G}$ is a $\mathcal{G}$-measurable function $\mathbb{E}^f_\mathcal{G}$ such that for all $G\in\mathcal{G}$ $$\int_G \mathbb{E}^f_\mathcal{G}~dP=\int_G f~dP.$$ The notion is not very intuitive, but the idea is the following: Since $\mathbb{E}^f_\mathcal{G}$ is $\mathcal{G}$-measurable, it uses only the information in $\mathcal{G}$. The integral condition says that $\mathbb{E}^f_\mathcal{G}$ "averages $f$ out" over sets in $\mathcal{G}$.
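To make the "averaging out" concrete, here is a minimal sketch on a finite probability space. Everything in it is an illustrative assumption and not from Gray's book: the fair die, the choice of $f$ as the face value, the partition into low and high faces that generates $\mathcal{G}$, and the helper names `integral`, `conditional_expectation` and `generated_sigma_algebra`. On a finite space with a partition-generated $\mathcal{G}$, the conditional expectation of $f$ given $\mathcal{G}$ is constant on each block and equals the $P$-weighted average of $f$ there.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical example: a fair six-sided die, f = face value.
omega = [1, 2, 3, 4, 5, 6]
P = {w: Fraction(1, 6) for w in omega}
f = {w: Fraction(w) for w in omega}

# G is the sigma-algebra generated by this partition (low vs. high faces).
partition = [frozenset({1, 2, 3}), frozenset({4, 5, 6})]


def integral(func, event):
    """Integral of func over event with respect to P (a finite sum)."""
    return sum(func[w] * P[w] for w in event)


def conditional_expectation(func, blocks):
    """G-measurable function: on each block, the P-weighted average of func.

    Assumes every block has positive probability.
    """
    cond = {}
    for block in blocks:
        mass = sum(P[w] for w in block)
        average = integral(func, block) / mass
        for w in block:
            cond[w] = average
    return cond


def generated_sigma_algebra(blocks):
    """All unions of blocks, including the empty set and the whole space."""
    events = []
    for r in range(len(blocks) + 1):
        for chosen in combinations(blocks, r):
            events.append(frozenset().union(*chosen))
    return events


cond_f = conditional_expectation(f, partition)

# Check the defining identity on every G in the generated sigma-algebra.
identity_holds = all(
    integral(cond_f, G) == integral(f, G)
    for G in generated_sigma_algebra(partition)
)
print(cond_f[1], cond_f[6], identity_holds)
```

Replacing `f` by the indicator of an event (1 on the event, 0 elsewhere) turns the same function into a conditional probability in this finite setting.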

Now if we want to calculate the conditional probability of the event $H\in\mathcal{B}$ with respect to the sub-$\sigma$-algebra $\mathcal{G}$, we simply take the conditional expectation of the indicator function $1_H$. Then, a conditional probability of $H$ with respect to $\mathcal{G}$ is a $\mathcal{G}$-measurable function $\mathbb{P}^H_\mathcal{G}$ such that for all $G\in\mathcal{G}$ $$\int_G \mathbb{P}^H_\mathcal{G}~dP=\int_G 1_H~dP.$$ Since $\int_G 1_H~dP=P(H\cap G)$, this can be rewritten as $$\int_G \mathbb{P}^H_\mathcal{G}~dP=P(H\cap G).$$
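To see how this connects with the elementary definition through joint probabilities, consider the special case (an assumption made only for illustration) where $\mathcal{G}$ is generated by a countable partition $G_1,G_2,\ldots$ of $\Omega$ with $P(G_i)>0$ for every $i$. A $\mathcal{G}$-measurable function is constant on each $G_i$, say equal to $c_i$ there. Taking $G=G_i$ in the last identity gives $$c_i\,P(G_i)=\int_{G_i}\mathbb{P}^H_\mathcal{G}~dP=P(H\cap G_i),\qquad\text{so}\qquad c_i=\frac{P(H\cap G_i)}{P(G_i)}=P(H\mid G_i).$$ Hence, in this case, $$\mathbb{P}^H_\mathcal{G}(\omega)=\sum_i \frac{P(H\cap G_i)}{P(G_i)}\,1_{G_i}(\omega),$$ that is, the function takes the value $P(H\mid G_i)$ at every point of $G_i$. The abstract definition is what remains of this formula when $\mathcal{G}$ is not generated by a partition into sets of positive probability.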

This is fairly standard material, so I assume the author simply made some typos. The $h$ is superfluous and the $m$ should be $P$.

Solution 2:

I think $gh$ means $g(h)$, that is, $g$ evaluated on the event $h$. Here $h$ is taking the role of $H$. I think this is just a fancy way of saying $P(G\cap H)=P(H\mid G)P(G)$ for all $G\in\mathcal{G}$. It is an integral because you are integrating over all values that $G$ takes on, i.e. $\int P(H\mid G=x)\,dP(x)=P(G\cap H)$.