Convexity and equality in Jensen's inequality

Your argument does not seem to be correct. You wrote:

To see this, suppose $f$ were not constant (on all but a set of measure 0). Namely, suppose $X = N\cup Y\cup Z$ with $N,Y$ and $Z$ pairwise disjoint, $\mu(Y) > 0 < \mu(Z)$, and $\mu(N) = 0$ such that $f(y) = a$ for all $y\in Y$ and $f(z) = b$ for all $z\in Z$, with $a\neq b$.

This would mean that $f$ takes only two values, $a$ and $b$. But a non-constant function can take many different values, not just two; consider simply $f(x)=x$.

EDIT: See the OP's comments below; it seems that I simply misunderstood the intended proof strategy.


Hint 1: Work with the constant $c=\int_X f(x) \,\mathrm{d}\mu(x)$. What can you say if you assume that $\{x;f(x)\ne c\}$ has non-zero measure?

Hint 2: Simply go through the proof of Jensen's inequality and, at the point where the convexity of $e^x$ is used, use the fact that it is strictly convex.


Proof: Suppose that $$ \exp\left(\int_X f(x)\,\mathrm{d}\mu(x)\right) = \int_X e^{f(x)}\,\mathrm{d}\mu(x). \qquad (1)$$ Here, as in Jensen's inequality, $\mu$ is a probability measure on $X$, so $\mu(X)=1$. We want to show that if equality (1) holds, then $f$ is constant almost everywhere.

Let us denote $c=\int_X f(x) \,\mathrm{d}\mu(x)$. The equality above can now be rewritten as $e^c = \int_X e^{f(x)} \,\mathrm{d}\mu(x)$.

Suppose that $f$ is not almost everywhere constant, i.e. $f(x)\ne c$ on a set of positive measure. Then both sets $A=\{x; f(x)>c\}$ and $B=\{x; f(x)<c\}$ must have positive measure. (Otherwise we would get a contradiction with the definition of $c$ as the mean value of $f$: if, say, $\mu(B)=0$, then $f\ge c$ almost everywhere and $f>c$ on a set of positive measure, forcing $\int_X f(x)\,\mathrm{d}\mu(x) > c$.)

For any $t\ne c$ we have $e^t>e^c+e^c(t-c)$, since the graph of the strictly convex function $e^x$ lies strictly above its tangent line at $c$, except at the point of tangency. (The factor $e^c$ is the slope of $e^x$ at the point $c$.) This means $$e^t-e^c>e^c(t-c).$$
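For completeness, here is a short calculus verification of this tangent-line inequality (a standard argument, not spelled out above). Set
$$g(t) = e^t - e^c - e^c(t-c), \qquad g'(t) = e^t - e^c.$$
Then $g(c)=0$, while $g'(t)<0$ for $t<c$ and $g'(t)>0$ for $t>c$. Hence $g$ is strictly decreasing on $(-\infty,c)$ and strictly increasing on $(c,\infty)$, so $g(t)>g(c)=0$ for every $t\ne c$, which is exactly the claimed strict inequality.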

Thus for any $x\in A\cup B$ we have the strict inequality $e^{f(x)}-e^c>(f(x)-c)e^{c}$, while for every $x$ we have $e^{f(x)}-e^c \ge (f(x)-c)e^c$. Since the weak inequality holds everywhere and the strict one holds on the set $A\cup B$ of positive measure, integrating (and using $\mu(X)=1$) gives $$\int_X e^{f(x)} \,\mathrm{d}\mu(x) - e^c > e^c \left(\int_X f(x) \,\mathrm{d}\mu(x) -c \right) = 0,$$ contradicting the equality (1).
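This is of course not part of the proof, but if you want to see the gap numerically, here is a minimal sketch on a finite probability space (the weights and values below are made up purely for illustration):

```python
import numpy as np

# A finite probability space: three points with weights summing to 1
# (hypothetical values, chosen only for illustration).
mu = np.array([0.2, 0.3, 0.5])

def jensen_gap(f, mu):
    """Return  int_X e^f dmu  -  exp(int_X f dmu)  on a finite space."""
    return np.dot(mu, np.exp(f)) - np.exp(np.dot(mu, f))

# Non-constant f: the gap is strictly positive.
print(jensen_gap(np.array([0.0, 1.0, 2.0]), mu))  # > 0
# Constant f: the gap vanishes (up to floating-point rounding).
print(jensen_gap(np.array([1.3, 1.3, 1.3]), mu))  # ~ 0
```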

The same argument works for any strictly convex function in place of $e^x$; see, e.g., Lieb and Loss, Analysis, p. 45. (Perhaps my answer would be clearer and simpler if I had worked with an arbitrary strictly convex function. I should have thought of this sooner...)
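For reference, the general form of the statement (I am paraphrasing from memory; the exact formulation in Lieb and Loss may differ) is: if $\varphi$ is strictly convex and $\mu$ is a probability measure, then
$$\varphi\!\left(\int_X f\,\mathrm{d}\mu\right) = \int_X \varphi(f)\,\mathrm{d}\mu \quad\Longrightarrow\quad f = \int_X f\,\mathrm{d}\mu \ \text{ almost everywhere},$$
and the proof above goes through verbatim once the slope $e^c$ is replaced by the slope of a supporting line of $\varphi$ at $c$, for which the strict supporting-line inequality at all $t\ne c$ follows from strict convexity.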