If $(\nabla f(x)-\nabla f(y))\cdot(x-y)\geq m(x-y)\cdot(x-y)$, why is $f$ convex?

I was reading on wikipedia that a strongly convex function is also strictly convex.

I say that a function $f\colon\mathbb{R}^n\to\mathbb{R}$ is convex if $$ f(\lambda x+(1-\lambda)y)\leq\lambda f(x)+(1-\lambda)f(y) $$ for all $x,y\in\mathbb{R}^n$, and $\lambda\in[0,1]$. If $f$ is continuously differentiable, and strongly convex, so that there exists $m>0$ such that $$ (\nabla f(x)-\nabla f(y))\cdot(x-y)\geq m(x-y)\cdot(x-y) $$ for all $x,y\in\mathbb{R}^n$, how can you recover that $f$ is convex?

Writing $x=(x_1,\dots,x_n)$ and $y=(y_1,\dots,y_n)$, I could only interpret the above inequality like $$ \sum_{i=1}^n\frac{\partial f}{\partial e_i}(x-y)(x_i-y_i)\geq m\sum_{i=1}^n(x_i-y_i)^2. $$

Is there an explicit way to deduce convexity from strong convexity?


Solution 1:

$$ f(\lambda x + (1-\lambda)y)\le \lambda f(x)+(1-\lambda)f(y) $$ is equivalent to $$ \lambda \,\left( f(\lambda x + (1-\lambda)y) - f(x)\right) \le (1-\lambda)\,\left( f(y)-f(\lambda x + (1-\lambda)y) \right) $$ This inequality can be written as

$$ \lambda \int_0^1\nabla f(x+s(1-\lambda)(y-x))\cdot(y-x) (1-\lambda) ds\le \\ \le(1-\lambda)\int_0^1 \nabla f(y+s\,\lambda (x-y))\cdot(y-x)\lambda ds $$ Or equivalently $$ \lambda (1-\lambda)\int_0^1\left(\nabla f(u)-\nabla f(v)\right)\cdot(y-x)\,ds \le 0 $$

where $u=x+s(1-\lambda)(y-x)$ and $v=y+s\,\lambda (x-y)$. Now note that $$ u-v=x+s(1-\lambda)(y-x) - y-s\,\lambda (x-y)=(x-y)(1-s) $$ So the previous inequality can be written as $$ \lambda (1-\lambda)\int_0^1\left(\nabla f(u)-\nabla f(v)\right)\cdot(v-u)\,\frac{1}{1-s}ds \le 0 $$ which is true because $f$ is strongly convex.

Solution 2:

Here's some intuition. Suppose $f$ is $C^2$ and let $Hf(x)$ be the hessian of $f$ at x. If $y$ is near $x$, then $\nabla f(y) - \nabla f(x) \approx Hf(x)(y-x)$, so \begin{align} \langle Hf(x) (y-x),y-x\rangle &\approx \langle \nabla f(y) - \nabla f(x),y-x \rangle \\ & \geq m \| y - x \|_2^2. \end{align}

This suggests that $Hf(x) \succeq mI$ and in particular that $Hf(x) \succeq 0$ (for all $x$).

We know that this last condition implies $f$ is convex.

Solution 3:

In my opinion, strong convexity implies convexity due to its definition. Indeed, recall that $f:\mathbb{R}^n\rightarrow\mathbb{R}$ is strongly convex if there exists $m>0$ such that $$ f[tx+(1-t)y]\leq tf(x)+(1-t)f(y)-\frac{mt(1-t)}{2}\|x-y\|^2 $$ for all $t\in[0, 1]$ and $x, y\in\mathbb{R}^n$ (definition of strongly convex). Since $$ \frac{mt(1-t)}{2}\|x-y\|^2\geq 0, $$ it follows that $$ f[tx+(1-t)y]\leq tf(x)+(1-t)f(y) $$ and so $f$ is convex.

Remark. The fact that strong convexity implies convexity does not need the differentiability of $f$ as in the proof of @Pocho la pantera and @littleO.