Lipschitz implies bounded gradient

Assume $f:\mathbb{R}^n \to \mathbb{R}$ is convex, and $L$-Lipschitz, so $|f(x)-f(y)|\leq L\|x-y\|$. I would like to show that $\|\nabla f(x)\|\leq L$.

In one dimension this is a straightforward consequence of the fact that convexity implies $f(y)-f(x)\geq f'(x) (y-x), \forall x, y \in \mathbb{R}$, but I'm having trouble translating this to several variables (in particular, Cauchy-Schwarz is working on the opposite direction!). If it helps, I don't mind assuming the domain of $f$ is a compact $[a,b]^n$.


Solution 1:

Here's how to do it if $f$ is defined on all of $\mathbb{R}^n$: If $\nabla f(x)=0$, we're done, otherwise, take $y=x+\nabla f(x)$. Then by convexity and Lipschitzness we have $$L\|\nabla f(x)\| =L \|x-y\|\geq | f(y)-f(x)|\geq \left|\left<\nabla f(x),\nabla f(x)\right>\right|=\|\nabla f(x)\|^2$$ which gives $\|\nabla f(x)\|\leq L$.

If $f$ is defined on a convex set and $x$ is an interior point, the same argument but with $y=x+\eta\nabla f(x)$ for some small $\eta>0$ gives us the same bound.

The annoying case is when $x$ is a point where the gradient is defined but moving in the direction of $\nabla f(x)$ takes us out of the convex set $K$. This can happen when $x$ is on the boundary of $K$ and yet we can approach it from each coordinate direction from within $K$ to get a one-sided partial derivative. Then, by convexity of the set, observe that either $\nabla f(x)$ or $-\nabla f(x)$ points in a direction where there are points from $K$. If it's $\nabla f(x)$, we already saw how to deal with it. So suppose that for all $\eta>0$ small enought, $x-\eta\nabla f(x)\in K$. Then we use the coordinate-free formulation of the gradient, which says that $$\lim_{\|x-y\|\to0} \frac{f(y)-f(x)-\left<\nabla f(x),y-x\right>}{\|y-x\|} =0$$ which in our case gives $$ \lim_{\eta\to 0} \frac{f(x-\eta\nabla f(x))-f(x) +\eta\|\nabla f(x)\|^2}{\eta\|\nabla f(x)\|} = 0$$ and thus $$ \|\nabla f(x)\| = \lim_{\eta\to0}\frac{f(x-\eta\nabla f(x))-f(x)}{\eta\|\nabla f(x)\|}$$ and this is a limit of quantities each of which is $\leq L$.