If $f$ is continuous and $\,f\big(\frac{1}2(x+y)\big) \le \frac{1}{2}\big(\,f(x)+f(y)\big)$, then $f$ is convex [duplicate]

Let $\,\,f :\mathbb R \to \mathbb R$ be a continuous function such that $$ f\Big(\dfrac{x+y}2\Big) \le \dfrac{1}{2}\big(\,f(x)+f(y)\big) ,\,\, \text{for all}\,\, x,y \in \mathbb R, $$
then how do we prove that $f$ is convex that is $$ f\big(tx+(1-t)y\big)\le tf(x)+(1-t)f(y) , \forall x,y \in \mathbb R , t\in (0,1)? $$
I can prove it for dyadic rational's $t=\dfrac k {2^n}$ and then argue by continuity ; but I would like a more direct proof . Thanks in Advance


Solution 1:

If $f\big(\frac{1}{2}(x+y)\big)\le \frac{1}{2}\big(f(x)+f(y)\big)$, then $$ f\Big(\frac{3}{4}x+\frac{1}{4}y\Big)=f\Big(\frac{1}{2}\Big(\frac{1}{2}(x+y)+x\Big)\Big)\le \frac{1}{2}\Big(f\Big(\frac{1}{2}(x+y)\Big)+f(x)\Big)\le \frac{3}{4}f(x)+\frac{1}{4}f(y). $$ Suitably repeating this argument, one could prove that whenever $m\in\{0,1,2,3,\ldots,2^n\}$, then $$ f\left(\frac{m}{2^n}x+\Big(1-\frac{m}{2^n}\Big)y\right)\le \frac{m}{2^n}f(x) +\Big(1-\frac{m}{2^n}\Big)f(y), \tag{1} $$ This can be done inductively on $n$.

Next observe that any $\lambda\in [0,1]$ can be approximated by rationals of the form $m/2^n$. In fact, $$ q_k=\frac{\lfloor2^k\lambda\rfloor}{2^k}\to\lambda,\quad\text{as $k\to\infty$.} $$ But $(1)$ implies that $$ f\big(q_kx+(1-q_k)y\big)\le q_kf(x)+(1-q_k)f(y), $$ and letting $k\to\infty$ and using the continuity of $f$ at $\lambda x+(1-\lambda)y$ we obtain that $$ f\big(\lambda x+(1-\lambda)y\big)\le \lambda f(x)+(1-\lambda)f(y). $$

Solution 2:

Here's a very analysis flavored proof:

Choose some $x$, $y$, and $\varepsilon>0$. Define $$C_f(\alpha)=\alpha f(x)+(1-\alpha)f(y)-f(\alpha x+(1-\alpha)y)$$ to be a measure of how "convex" $f$ is in the interval $[x,y]$. It is always nonnegative if $f$ is convex. Notice that $C_f$ is continuous. Since the graph of $C_f$ is just a linear transformation of that of $f$, it means that, from the condition on $f$, we get: $$C_f\left(\frac{\alpha+\beta}2\right)\leq\frac{C_f(\alpha)+C_f(\beta)}2.$$ This could be shown algebraically from the definition of $C_f$, but that derivation really nothing to write home about and is tangential to the heart of the proof.

The interesting thing we can do with $C_f$ is to define the following set based on it: $$S_{\varepsilon}=\{\alpha\in[0,1]:C_f(\alpha)>-\varepsilon\}.$$ Notice that it clearly holds that, if $\alpha,\beta\in S_{\varepsilon}$ then $\frac{\alpha+\beta}2\in S_{\varepsilon}$, since the average of two values that are not less than $-\varepsilon$ will still be at least $-\varepsilon$. Furthermore, since $C_f(x)=C_f(y)=0$, by continuity, we have that there exists some $\delta$ such that the sets $[0,\delta)$ and $(1-\delta,1]$ are in $S_{\varepsilon}$. This implies that any value arising from the average of any average of values in those two intervals is in $S_{\varepsilon}$ - so any value in the interval $\left(\frac{1-\delta}2,\frac{1+\delta}2\right)$ is in $S_{\varepsilon}$.

More generally, notice that if we take the set of averages of two intervals $(\alpha,\alpha+\delta)$ and $(\beta-\delta,\beta)$ for $\alpha<\beta$ where both intervals are in $S_{\varepsilon}$, where one has an infimum $\alpha$ and the other a supremum $\beta$ we will get the interval $\left(\frac{\alpha+\beta-\delta}2,\frac{\alpha+\beta+\delta}2\right)$ - an interval of length $\delta$ which is perfectly centered between them. Either this interval will overlap with both (by symmetry) the original intervals, implying that the entire interval $(\alpha,\beta)$ is in $S_{\varepsilon}$, or this interval will be disjoint from the original intervals. If the new interval is disjoint from both, we can form the sets of its averages with $(\alpha,\alpha+\delta)$ or $(\beta-\delta,\beta)$, forming two new intervals of length $\delta$, which may overlap (which would again yield that all of $(\alpha,\beta)$ is covered) or may not, in which case we keep iterating, producing $4$ intervals, and so on. We are bound to eventually stop iterating, since only finitely many disjoint intervals of length $\delta$ can be packed into $(\alpha,\beta)$, thus we will eventually conclude that all of $(\alpha,\beta)\subseteq S_{\varepsilon}$.

Applying the above starting with the statement that $[0,\delta)\cup(1-\delta,1]\subseteq S_{\varepsilon}$ yields that $S_{\varepsilon}=[0,1]$. Thus, $$\forall\varepsilon>0[C_f(x)>-\varepsilon]$$ which implies that $C_f(\alpha)\geq 0$ for all $\alpha\in[0,1]$, and therefore that $f$ is convex.

Solution 3:

Rather than using dyadic rational numbers it is easy to the use the standard rational numbers. Using induction we prove that $$f\left(\frac{x_{1} + x_{2} + \cdots + x_{n}}{n}\right) \leq \frac{f(x_{1}) + f(x_{2}) + \cdots + f(x_{n})}{n}\tag{1}$$ for all positive integers $n$ and all real numbers $x_{i}$. This is done in two stages:

  1. First we prove the relation $(1)$ for all positive integers $n$ of the form $n = 2^{m}$ where $m$ is a positive integer. This is easily done by induction on $m$ and note that the condition given in question corresponds to $m = 1$. I will leave this as an easy exercise in induction.
  2. Next we prove that if $n > 1$ is a positive integer and relation $(1)$ holds for $n$ then it also holds for $n - 1$. This is like running induction backwards.

Combining the two steps we can see that the first step takes care of integers of the form $n = 2^{m}$ and the second step takes care of the remaining values of $n$. To prove the second step we set $$X = \frac{x_{1} + x_{2} + \cdots + x_{n- 1}}{n - 1}$$ so that $$f(X) = f\left(\frac{x_{1} + x_{2} + \cdots + x_{n - 1} + X}{n}\right) \leq \frac{f(x_{1}) + f(x_{2}) + \cdots + f(x_{n - 1}) + f(X)}{n}$$ or $$f(X) \leq \frac{f(x_{1}) + f(x_{2}) + \cdots + f(x_{n - 1})}{n - 1}$$ so that the proof for second step is complete.

Now we show that $$f(tx + (1 - t)y) \leq tf(x) + (1 - t)f(y)\tag{2}$$ for all real $x, y$ and $t \in (0, 1)$. First let $t$ be rational so that $t = m/n$ where $m, n$ are positive integers with $m < n$. In equation $(1)$ we put $$x_{1} = x_{2} = \cdots = x_{m} = x,\, x_{m + 1} = x_{m + 2} = \cdots = x_{n} = y$$ to get $$f\left(\frac{mx + (n - m)y}{n}\right) \leq \frac{m}{n}f(x) + \frac{n - m}{n}f(y)$$ or $$f(tx + (1 - t)y) \leq tf(x) + (1 - t)f(y)$$ If $t \in (0, 1)$ is irrational then there is a sequence $t_{n}$ of rational numbers in $(0, 1)$ which converges to $t$ and thus we have $$f(t_{n}x + (1 - t_{n})y) \leq t_{n}f(x) + (1 - t_{n})f(y)$$ Taking limits as $n \to \infty$ and using continuity of $f$ we get equation $(2)$ for all irrational $t \in (0, 1)$.

Solution 4:

Sketch of Boas' proof -- without using dyadic rationals.

Define $\Delta_{h}f(x) = f(x+h) - f(x)$. Then midpoint convexity implies

$$\Delta_h^2f(x) = f(x+2h)-2f(x+h) + f(x)\geq 0$$ for all $x$ in some interval.

Note that for any positive integer $n$,

$$\Delta_hf(x) = \sum_{i=0}^{n-1} \Delta_{h/n}f(x + ih/n)\\ \Delta_{h/n}\Delta_hf(x) = \sum_{i=0}^{n-1} \Delta_{h/n}^2f(x + ih/n)\geq 0,$$

and

$$\Delta_{h}f(x + h/n) \geq \Delta_hf(x).$$

For any positive integer m,

$$\Delta_{h}f(x + mh/n) \geq \Delta_hf(x+ (m-1)h/n \geq \cdots \geq\Delta_hf(x)$$

If $x$, $x + \delta$, and $h$ are given, we can find sequences of rational numbers $m/n$ such that $mh/n \rightarrow \delta$, and, since $f$ is continuous

$$\Delta_hf(x+\delta) \geq \Delta_hf(x).$$

Hence, $\Delta_hf(x)$ increases with $x$ for each $h$.

Let $0 < m < n$. The average of the first $m$ terms in the chain of inequalities does not exceed the average of the first $n$ terms, which reduces to

$$\frac{f(x + mh/n) - f(x)}{m}\leq \frac{f(x + h) - f(x)}{n}.$$

For $h > 0$ we have

$$\frac{f(x + mh/n) - f(x)}{mh/n}\leq \frac{f(x + h) - f(x)}{h}.$$

If $0 < g < h$ we can choose a sequence of rationals $m/n$ such that $m/n \rightarrow g/h$ and $$\frac{\Delta_gf(x)}{g}\leq \frac{\Delta_hf(x)}{h}.$$

Take $y > x$ and let $z = t_1x + t_2y$, where $t_1 + t_2 = 1$,$h = y-x$, $g= z-x$ so that $y-z = t_1(y-x)$ and $z-x = t_2(y-x)$

Then

$$\frac{f(z)-f(x)}{z-x}\leq \frac{f(y)-f(x)}{y-x},$$

and

$$f(t_1x+t_2y) = f(z) \leq \frac{y-z}{y-x}f(x) + \frac{z-x}{y-x}f(y) = t_1f(x) + t_2 f(y)$$