Two notions of total variation norms

I found these two definitions of the total variation norm for probability measures on $(X,\mathcal{F})$:

$$ \left \|\mu- \nu \right \|_{TV} = \sup_{\text{$f:X \rightarrow [-1,1]$ measurable}} \left \{ \int_X fd\mu - \int_X fd\nu \right \}.$$

and the second one:

$$\left \|\mu- \nu \right \|_{TV} = \sup_{ A\in \mathcal{F}}\left|\mu(A)-\nu(A)\right|.$$

I can't really see how they are the same. Is there an obvious way to see this?


Solution 1:

First of all, note that the first definition should read

$$\|\mu-\nu\|_{\text{TV}} = \frac{1}{2} \sup_{f: X \to [-1,1] \, \text{msb}} \left\{ \int_X f \, d\mu - \int_X f \, d\nu \right\},$$

otherwise equality does not hold. (Simply consider $\mu = \delta_x$, $\nu = \delta_y$ for $x \neq y$.)
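
To spell out this counterexample (a short sketch, assuming there is a set $A \in \mathcal{F}$ with $x \in A$ and $y \notin A$, e.g. if singletons are measurable): for any measurable $f: X \to [-1,1]$ we have $\int_X f \, d\delta_x - \int_X f \, d\delta_y = f(x)-f(y) \leq 2$, with equality for $f = 1_A - 1_{A^c}$, whereas $|\delta_x(B)-\delta_y(B)| \leq 1$ for every $B \in \mathcal{F}$, with equality for $B = A$. So

$$\sup_{f: X \to [-1,1] \, \text{msb}} \left\{ \int_X f \, d\delta_x - \int_X f \, d\delta_y \right\} = 2 \qquad \text{but} \qquad \sup_{B \in \mathcal{F}} |\delta_x(B)-\delta_y(B)| = 1,$$

and the two quantities agree only after inserting the factor $\frac{1}{2}$.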


Proof

  1. Let $A \in \mathcal{F}$. Then $$\mu(A^c)-\nu(A^c) = (1-\mu(A)) - (1-\nu(A)) = \nu(A)-\mu(A)$$ since $\mu$, $\nu$ are probability measures. If we set $f := 1_A - 1_{A^c}$, we get $$\int_X f \, d\mu - \int_X f \, d\nu =(\mu(A)-\mu(A^c))-(\nu(A)-\nu(A^c)) = 2\mu(A)-2 \nu(A).$$ Hence, $$\mu(A)-\nu(A) \leq \frac{1}{2} \sup_{g: X \to [-1,1] \, \text{msb}} \left\{ \int_X g \, d\mu - \int_X g \, d\nu \right\}.$$ Applying the same argument to $-f$ yields $$|\mu(A)-\nu(A)| \leq \frac{1}{2} \sup_{g: X \to [-1,1]\, \text{msb}} \left\{ \int_X g \, d\mu - \int_X g \, d\nu \right\}.$$ Since $A \in \mathcal{F}$ was arbitrary, the same bound holds for $\sup_{A \in \mathcal{F}} |\mu(A)-\nu(A)|$.
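  2. Now let $f:X \to [-1,1]$ be measurable. By the Sombrero lemma, there exists a sequence of simple functions $(f_n)_n$ such that $|f_n| \leq |f| \leq 1$ and $f_n \to f$ pointwise; by dominated convergence, $\int f_n \, d\mu - \int f_n \, d\nu \to \int f \, d\mu - \int f \, d\nu$. Therefore, it suffices to show the claim for any simple function $f$ of the form $$f = \sum_{j=1}^n c_j \cdot 1_{A_j}$$ where $|c_j| \leq 1$ and $A_j \in \mathcal{F}$. Without loss of generality, we may assume that the sets $A_j$, $j=1,\ldots,n$, are disjoint and cover $X$ (if necessary, add the complement of their union with coefficient $0$). Set $\delta_j := \mu(A_j)-\nu(A_j)$; since $\mu$, $\nu$ are probability measures, $\sum_{j=1}^n \delta_j = \mu(X)-\nu(X) = 0$. Then, $$\begin{align*} \int f \, d\mu - \int f \, d\nu &= \sum_{j=1}^n c_j (\mu(A_j)-\nu(A_j)) \\ &\leq \sum_{j:\delta_j \geq 0} (\mu(A_j)-\nu(A_j)) - \sum_{j:\delta_j < 0} (\mu(A_j)-\nu(A_j)) \\ &= 2 \sum_{j:\delta_j \geq 0} (\mu(A_j)-\nu(A_j)) \\ &= 2 \mu\left(\bigcup_{j:\delta_j \geq 0} A_j \right) - 2 \nu\left(\bigcup_{j:\delta_j \geq 0} A_j \right).\end{align*}$$ The inequality uses $|c_j| \leq 1$, the next step uses $\sum_{j} \delta_j = 0$, and in the last step we used the fact that the sets $A_j$ are disjoint. Consequently, $$\frac{1}{2} \left( \int f \, d\mu - \int f \, d\nu \right) \leq \sup_{A \in \mathcal{F}} |\mu(A)-\nu(A)|,$$ which finishes the proof.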
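
As a sanity check of the identity, here is a small numerical sketch on a finite space $X = \{0,\ldots,n-1\}$ with $\mathcal{F}$ the power set. The function names `tv_sets` and `tv_functions` and the example measures are my own choices for illustration, not part of the question. On a finite space the supremum over $f$ is attained at $f = \operatorname{sign}(\mu-\nu)$, so the first definition reduces to half the $\ell^1$ distance of the weight vectors; the second is computed by brute force over all subsets.

```python
from itertools import combinations


def tv_sets(mu, nu):
    """sup over all subsets A of |mu(A) - nu(A)|, by brute force."""
    n = len(mu)
    best = 0.0
    for r in range(n + 1):
        for A in combinations(range(n), r):
            best = max(best, abs(sum(mu[i] - nu[i] for i in A)))
    return best


def tv_functions(mu, nu):
    """(1/2) sup over f: X -> [-1, 1] of (int f dmu - int f dnu).

    On a finite space the sup is attained at f = sign(mu - nu),
    so this is half the l1 distance between the weight vectors.
    """
    return 0.5 * sum(abs(m - v) for m, v in zip(mu, nu))


# Illustrative probability vectors (dyadic weights, so float arithmetic is exact).
mu = [0.5, 0.25, 0.25, 0.0]
nu = [0.125, 0.375, 0.25, 0.25]

print(tv_sets(mu, nu), tv_functions(mu, nu))

# The Dirac example: delta_x versus delta_y on a two-point space.
print(tv_sets([1.0, 0.0], [0.0, 1.0]), tv_functions([1.0, 0.0], [0.0, 1.0]))
```

Both calls should print two equal numbers. The brute-force maximum is attained at $A = \{j : \mu(\{j\}) \geq \nu(\{j\})\}$, which is exactly the set $\bigcup_{j:\delta_j \geq 0} A_j$ appearing in step 2.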