Lipschitz Smoothness, Strong Convexity and the Hessian

Solution 1:

I assume that your question concerns convex functions only; without convexity much of it would be false.

Question 2: strictly speaking, being Lipschitz smooth ($C^{1,1}$) does not imply that $\nabla^2 f$ exists everywhere. But the statement is true if we interpret $\nabla^2 f\preceq LI$ as holding almost everywhere. Indeed, since $f$ is convex, $\nabla^2 f$ is a positive semidefinite matrix wherever it exists, so having $\nabla^2 f\preceq LI$ a.e. is equivalent to $\nabla^2 f\in L^\infty$. And it is well known that having an $L^\infty$ derivative is equivalent to being Lipschitz; thus $$\nabla^2 f\in L^\infty \iff \nabla f\in C^{0,1} \iff f\in C^{1,1}$$
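To make the first sentence concrete, here is a minimal numerical sketch. The example function $f(x)=\frac12\max(x,0)^2$ and all helper names are my own choices for illustration, not something from the question. Its gradient $\max(x,0)$ is Lipschitz with constant 1, yet the one-sided second derivatives at $0$ disagree, so $\nabla^2 f$ exists only almost everywhere.

```python
def grad_f(x):
    """Gradient of the illustrative function f(x) = 0.5 * max(x, 0)**2."""
    return max(x, 0.0)


def lipschitz_estimate(g, points):
    """Largest slope |g(a) - g(b)| / |a - b| over all pairs of distinct sample points."""
    best = 0.0
    for i, a in enumerate(points):
        for b in points[i + 1:]:
            best = max(best, abs(g(a) - g(b)) / abs(a - b))
    return best


def one_sided_second_derivative(g, x, h):
    """Difference quotient of the gradient; h > 0 looks right, h < 0 looks left."""
    return (g(x + h) - g(x)) / h


# Sample grid on [-5, 5]; the grid and step sizes are arbitrary assumptions.
points = [i / 10.0 for i in range(-50, 51)]
L_est = lipschitz_estimate(grad_f, points)
right = one_sided_second_derivative(grad_f, 0.0, 1e-6)
left = one_sided_second_derivative(grad_f, 0.0, -1e-6)
print("Lipschitz estimate of the gradient:", L_est)
print("second derivative from the right at 0:", right)
print("second derivative from the left at 0:", left)
```

The slope estimate stays bounded by 1 while the left and right quotients at $0$ differ, which is the situation the "almost everywhere" reading is meant to cover.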

Question 3: You misremembered. The correct inequality characterizing $\alpha$-strong convexity is $$f(x+y) \ge f(x) + y^\top\nabla f(x) + \frac{\alpha}{2} \| y \|^2 \tag{1}$$ Indeed, (1) is equivalent to saying that the function $g(x)=f(x)-\frac{\alpha}{2} \| x \|^2$ is convex. The latter is equivalent to $\nabla^2 g\succeq 0$, which is $\nabla^2 f\succeq \alpha\, I$.
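To see where the quadratic term comes from, here is the computation spelled out, assuming $f$ is differentiable. Convexity of $g$ means $g(x+y)\ge g(x)+y^\top\nabla g(x)$ for all $x,y$. With $\nabla g(x)=\nabla f(x)-\alpha x$ this reads $$f(x+y)-\frac{\alpha}{2}\|x+y\|^2 \ge f(x)-\frac{\alpha}{2}\|x\|^2 + y^\top\nabla f(x) - \alpha\, y^\top x$$ Since $\|x+y\|^2=\|x\|^2+2x^\top y+\|y\|^2$, the terms in $\|x\|^2$ and $x^\top y$ cancel, leaving $$f(x+y)\ge f(x)+y^\top\nabla f(x)+\frac{\alpha}{2}\|y\|^2$$ so the quadratic term involves only the increment $y$.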

Question 4. Yes, there is a direct and important relation: a function is strongly convex if and only if its convex conjugate (a.k.a. Legendre-Fenchel transform) is Lipschitz smooth; more precisely, $f$ is $\alpha$-strongly convex if and only if $f^*$ is Lipschitz smooth with constant $1/\alpha$. Indeed, the gradient maps are inverses of each other, which implies that the Hessian of the convex conjugate of $f$ is the inverse of the Hessian of $f$ (at an appropriate point). So, a uniform upper bound on $\nabla^2 f$ is equivalent to a uniform lower bound on $\nabla^2 (f^{*}) $, and vice versa. One can also argue without referring to the Hessian (which may fail to exist at some points): the Lipschitz smoothness of $f$, by your item 1, gives us at every $x_0$ a quadratic function $q$ so that $q(x_0)=f(x_0)$ and $f \le q$ everywhere. Taking the convex conjugate reverses the order: $q^*\le f^*$; and this means that $f^*$ is strongly convex.
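A quick sanity check of the inverse-Hessian relation in one dimension, as a rough sketch only: for $f(x)=\frac{a}{2}x^2$ the conjugate is $f^*(y)=\frac{y^2}{2a}$, so $f''=a$ and $(f^*)''=1/a$. The value of `a`, the grid, and the helper names below are hypothetical choices of mine, and the supremum is only approximated by a maximum over a finite grid.

```python
def conjugate(f, y, grid):
    """Approximate f*(y) = sup_x (x*y - f(x)) by a maximum over a finite grid."""
    return max(x * y - f(x) for x in grid)


def second_difference(g, y, h):
    """Central second difference, an approximation of g''(y)."""
    return (g(y + h) - 2.0 * g(y) + g(y - h)) / (h * h)


a = 4.0  # hypothetical curvature: f''(x) = a everywhere


def f(x):
    """Quadratic test function f(x) = (a/2) x^2."""
    return 0.5 * a * x * x


# Grid on [-5, 5] with step 1e-3 (an assumption; it must contain the maximizer y/a).
grid = [i / 1000.0 for i in range(-5000, 5001)]


def f_star(y):
    """Grid approximation of the convex conjugate of f."""
    return conjugate(f, y, grid)


curvature_of_conjugate = second_difference(f_star, 1.0, 0.5)
print("f'' =", a)
print("approximate (f*)'' =", curvature_of_conjugate)
print("1 / f'' =", 1.0 / a)
```

A larger curvature of $f$ gives a flatter conjugate, which is the one-dimensional version of trading an upper Hessian bound for a lower one.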

Question 1. The converse is true, but the only proof I see goes through the convex conjugate as described in Q4. Since strong convexity is characterized by the comparison property (1), taking the conjugate gives a matching characterization of Lipschitz smoothness.

Reference: Chapter 5 of Convex functions by Jonathan M. Borwein and Jon D. Vanderwerff.

Solution 2:

Regarding the references, Nesterov's "Introductory Lectures on Convex Optimization" (ISBN: 978-1-4613-4691-3) answers at least three out of the four questions asked here. Specific equations/theorems:

Question 1: See equation (1.2.12) and substitute "y" for "x+y"

Question 2: Theorem 2.1.6, with the adjoining proof.

Question 3: As mentioned in the previous answer, the equation was slightly wrong in the original question. But if corrected, this is proven in Theorem 2.1.11.
