Convergence of steepest-descent dynamics under a weak convexity assumption

Let $f\colon \mathbb R^d \to \mathbb R$ be smooth and assume that

  • $f$ is a convex function (its Hessian is positive semi-definite);
  • $f$ is bounded from below, that is, $\inf_{z \in \mathbb R^d} f(z) > - \infty$.

Let $x(t)$ denote the solution to the differential equation $$ \dot x(t) = - \nabla f\bigl(x(t)\bigr), \qquad x(0) = x_0. $$ Question: are there classical results on gradient flow stating that $$ \lim_{t\to\infty}f\bigl(x(t)\bigr) = \inf_{z \in \mathbb R^d} f(z) =: I \qquad \qquad (1) $$ for all initial conditions $x_0 \in \mathbb R^d$? Note that $f\bigl(x(t)\bigr)$ may converge even if $x(t)$ does not, for example in the case $f(x) = e^{-x}$.
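To make the example concrete, here is a minimal numerical sketch of the flow for $f(x) = e^{-x}$, using a forward-Euler discretisation (the step size and horizon are my own illustrative choices): $x(t)$ drifts to $+\infty$ while $f\bigl(x(t)\bigr)$ still converges to $\inf f = 0$.

```python
import math

def gradient_flow_1d(fprime, x0, t_end, dt=1e-3):
    """Forward-Euler discretisation of x'(t) = -f'(x(t))."""
    x = x0
    for _ in range(int(t_end / dt)):
        x -= dt * fprime(x)
    return x

# f(x) = exp(-x): convex, bounded below with inf f = 0,
# but the infimum is not attained.  The trajectory diverges
# to +infinity while f(x(t)) converges to inf f.
x_T = gradient_flow_1d(lambda x: -math.exp(-x), x0=0.0, t_end=50.0)
print(x_T, math.exp(-x_T))  # x(T) is large, f(x(T)) is close to 0
```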

Proof of convergence when $f$ is coercive, that is, when $f(z) \to \infty$ as $|z| \to \infty$. It suffices to show that, for every $\varepsilon > 0$, the trajectory $x(t)$ eventually reaches the sublevel set $A_{\varepsilon} := \{z: f(z) \leq I + \varepsilon \}$; since $t \mapsto f\bigl(x(t)\bigr)$ is nonincreasing, the trajectory then remains in $A_\varepsilon$. These sets are convex and closed, and bounded by coercivity, hence compact. Any minimizing sequence eventually lies in $A_{\varepsilon}$ and therefore has a convergent subsequence, so there exists a (possibly non-unique) minimizer $z_*$ of $f$. We denote $$ R_{\varepsilon} = \sup_{z \in A_{\varepsilon}} \|z - z_*\| \in (0, \infty), $$ which is finite because $A_\varepsilon$ is bounded.

Let us fix $\varepsilon > 0$ and show that the norm $\|\nabla f\|$ is bounded from below by a strictly positive constant $C_{\varepsilon}$ uniformly over $z \notin A_{\varepsilon}$. Fix $z \notin A_\varepsilon$ and let $y$ denote the intersection of the boundary of $A_{\varepsilon}$ with the segment joining $z_*$ and $z$. Since $f(y) = I + \varepsilon$, convexity gives $$ \langle \nabla f(y), z_* - y \rangle \leq f(z_*) - f(y) = - \varepsilon. $$ By monotonicity of the gradient of a convex function, $$ \langle \nabla f(z) - \nabla f(y), z - y \rangle \geq 0, $$ and since $z - y = \frac{\|z-y\|}{\|y-z_*\|} (y - z_*)$, we obtain $$ \|\nabla f(z) \| \|z - y\| \geq \langle \nabla f(z), z - y \rangle \geq \langle \nabla f(y), z - y \rangle = \frac{\|z-y\|}{\|y-z_*\|} \langle \nabla f(y), y - z_* \rangle \geq \frac{\|z-y\|}{\|y-z_*\|} \varepsilon \geq \|z-y\| \frac{\varepsilon}{R_{\varepsilon}}. $$ Dividing both sides by $\|z-y\| > 0$, we obtain the desired lower bound with $C_\varepsilon = \varepsilon/R_\varepsilon$. The conclusion then follows from $$ \frac{\mathrm d}{\mathrm dt} f\bigl(x(t)\bigr) = \bigl\langle \nabla f\bigl(x(t)\bigr), \dot x(t) \bigr\rangle = - \| \nabla f\bigl(x(t)\bigr) \|^2 \leq - C_\varepsilon^2 \quad \text{while } x(t) \notin A_\varepsilon, $$ so $x(t)$ must enter $A_\varepsilon$ in finite time.
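The decay mechanism $\frac{\mathrm d}{\mathrm dt} f(x(t)) = -\|\nabla f(x(t))\|^2$ can be observed numerically on a coercive example; the quadratic $f(z) = z_1^2 + 2 z_2^2$, the initial point, and the step size below are my own illustrative choices.

```python
import numpy as np

# Coercive convex example f(z) = z1^2 + 2*z2^2, with I = inf f = 0.
def f(z):
    return z[0]**2 + 2.0 * z[1]**2

def grad(z):
    return np.array([2.0 * z[0], 4.0 * z[1]])

z = np.array([3.0, -2.0])
dt, steps = 1e-3, 20000           # integrate up to t = 20
values = [f(z)]
for _ in range(steps):
    z = z - dt * grad(z)          # forward Euler for z' = -grad f(z)
    values.append(f(z))

# f(x(t)) is nonincreasing and enters every sublevel set A_eps.
assert all(a >= b for a, b in zip(values, values[1:]))
print(values[-1])  # essentially 0
```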

Proof that (1) holds if $d = 1$. If $f$ attains its infimum the proof is not difficult, so we suppose that $f$ does not have a minimizer. In this case either $f'(z) > 0$ for all $z \in \mathbb R$ or $f'(z) < 0$ for all $z \in \mathbb R$. Without loss of generality, assume the latter case holds. Then clearly $I = \lim_{z \to \infty} f(z)$ and $x(t)$ is a strictly increasing function of $t$. It is impossible that $x(t) \to x_*$ for some $x_* < \infty$, because $\dot x = -f'(x)$ is bounded from below by a positive constant in a neighborhood of $x_*$, which contradicts convergence. Consequently, it must hold that $x(t) \to \infty$ and so $\lim_{t \to \infty} f(x(t)) = I$.
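For concreteness, in the example $f(x) = e^{-x}$ this one-dimensional mechanism can be followed in closed form: the equation $\dot x = -f'(x) = e^{-x}$ separates as $e^{x} \, \mathrm dx = \mathrm dt$, so $$ e^{x(t)} = e^{x_0} + t, \qquad f\bigl(x(t)\bigr) = \frac{1}{e^{x_0} + t} \xrightarrow[t \to \infty]{} 0 = I, $$ even though $x(t) = \log\bigl(e^{x_0} + t\bigr)$ diverges.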

The interesting case is therefore when $f$ is non-coercive and $d > 1$. Then either $f$ attains its infimum or it does not, and I am particularly interested in the latter situation: $f$ does not have a minimizer.
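A concrete test case of this kind (my own illustrative choice, not from any reference) is $f(x, y) = e^{-x} + y^2$ on $\mathbb R^2$: it is convex, non-coercive, bounded below with $I = 0$, and has no minimizer. A quick numerical sketch of the flow is consistent with (1):

```python
import numpy as np

# f(x, y) = exp(-x) + y^2: convex, non-coercive, inf f = 0 not attained.
def f(z):
    return np.exp(-z[0]) + z[1]**2

def grad(z):
    return np.array([-np.exp(-z[0]), 2.0 * z[1]])

z = np.array([0.0, 1.0])
dt = 1e-3
for _ in range(100000):     # integrate up to t = 100
    z = z - dt * grad(z)

# x(t) escapes to +infinity, y(t) -> 0, and f(x(t)) approaches I = 0.
print(z, f(z))
```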


Solution 1:

In the end, a colleague found the following simple solution. Since $f\bigl(x(t)\bigr)$ is nonincreasing and bounded from below, it converges; assume for contradiction that $$ \lim_{t \to \infty} f\bigl(x(t)\bigr) = I + 2\varepsilon $$ for some $\varepsilon > 0$, and let $x_*$ be such that $f(x_*) \leq I + \varepsilon$. Then, using the convexity of $f$ and the fact that $f\bigl(x(t)\bigr) \geq I + 2\varepsilon$ for all $t$, we have $$ \frac{1}{2} \frac{\mathrm d}{\mathrm dt} \|x(t)-x_*\|^2 = - \Bigl\langle \nabla f\bigl(x(t)\bigr), x(t) - x_* \Bigr\rangle \leq f\bigl(x_*\bigr) - f\bigl(x(t)\bigr) \leq - \varepsilon, $$ and so $\|x(t)-x_*\|^2 \to -\infty$, which is impossible.
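The engine of this argument is just the gradient inequality for convex functions, $\langle \nabla f(x), x - x_* \rangle \geq f(x) - f(x_*)$. As a small sanity check, here it is verified at random sample points for the illustrative (hypothetical) example $f(x, y) = e^{-x} + y^2$ with a reference point $x_*$ of my own choosing:

```python
import numpy as np

def f(z):
    return np.exp(-z[0]) + z[1]**2

def grad(z):
    return np.array([-np.exp(-z[0]), 2.0 * z[1]])

# Reference point with f(x_*) = exp(-5), close to I = 0.
x_star = np.array([5.0, 0.0])

rng = np.random.default_rng(0)
ok = True
for _ in range(1000):
    z = 3.0 * rng.normal(size=2)
    # Convexity: <grad f(z), z - x_*> >= f(z) - f(x_*), which is what
    # makes ||x(t) - x_*||^2 decrease while f(x(t)) stays above f(x_*).
    ok = ok and grad(z) @ (z - x_star) >= f(z) - f(x_star) - 1e-8
print(ok)
```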