What is the intuition behind Slater's condition in optimization? (And other constraint qualifications.)

Solution 1:

Slater's condition relates to the existence of Lagrange multipliers in a convex program. Lagrange multipliers exist when there is a nonvertical supporting hyperplane that passes through a point on the boundary of a related convex set. Intuitively, nonvertical hyperplanes can fail to exist only when the constraints of the convex program can be achieved only at "edge"-type points of the set; Slater's condition ensures the constraints can be achieved at "non-edge"-type points.

Discussions with pictures can be found in various places, such as Convex Analysis and Optimization by Bertsekas, Nedic, and Ozdaglar. The Slater result is simple to derive, and below I summarize a mathematical derivation that may also give intuition.


The convex program

Mathematically we have a general convex program:

\begin{align} \mbox{Minimize:} \quad & f(x) \\ \mbox{Subject to:} \quad & g_i(x) \leq 0 \quad \forall i \in \{1, \ldots, k\} \\ & x \in \mathcal{X} \end{align} where $\mathcal{X} \subseteq \mathbb{R}^n$ is a convex set and $f, g_1, \ldots, g_k$ are convex functions defined over $\mathcal{X}$. Let us suppose there is (at least one) $x^* \in \mathcal{X}$ that is an optimal solution to the above convex program.
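To make this concrete, here is a minimal numerical sketch of such a program. The specific $f$, $g$, and $\mathcal{X}$ below are illustrative assumptions of mine, not taken from the text above:

```python
# Toy 1-D instance of the convex program above (my own illustrative choice):
#   minimize f(x) = (x - 2)^2  subject to g(x) = x - 1 <= 0,  x in X = [-3, 3].
# The unconstrained minimizer x = 2 is infeasible, so the constraint is
# active at the optimum: x* = 1 with f(x*) = 1.

f = lambda x: (x - 2) ** 2
g = lambda x: x - 1

# Crude grid search over X (adequate for a 1-D sketch).
grid = [-3 + 6 * t / 100000 for t in range(100001)]
feasible = [x for x in grid if g(x) <= 0]
x_star = min(feasible, key=f)

print(round(x_star, 3), round(f(x_star), 3))  # -> 1.0 1.0
```

This toy instance also satisfies Slater's condition (for example $s = 0$ gives $g(s) = -1 < 0$), which will matter below.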

Lagrange multipliers

Define the set $\mathcal{A}$: $$ \mathcal{A} = \{(y_0, y_1, \ldots, y_k) : y_0 \geq f(x),\ y_1 \geq g_1(x),\ \ldots,\ y_k \geq g_k(x) \mbox{ for some $x \in \mathcal{X}$} \}$$

It can be shown that $\mathcal{A}$ is a convex set and that the point $(f(x^*), 0, 0, \ldots, 0)$ is a boundary point of the set. By the hyperplane separation theorem, there are real numbers $\theta_0, \theta_1, \ldots, \theta_k$, not all zero, such that $$ \theta_0 y_0 + \sum_{i=1}^k \theta_i y_i \geq \theta_0 f(x^*) \quad \forall (y_0, \ldots, y_k) \in \mathcal{A} $$ By the nature of the set $\mathcal{A}$ (namely, any point that dominates a point in $\mathcal{A}$ is also in $\mathcal{A}$), it is easy to show the above can hold only if $\theta_i \geq 0$ for all $i \in \{0, 1, \ldots, k\}$. Since for any $x \in \mathcal{X}$ we have $(f(x), g_1(x), \ldots, g_k(x)) \in \mathcal{A}$, a special case of the above inequality is: $$ \theta_0 f(x) + \sum_{i=1}^k \theta_i g_i(x) \geq \theta_0 f(x^*) \quad \forall x \in \mathcal{X} \quad (\text{Eq. 1})$$
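The separation inequality can be spot-checked numerically. The sketch below uses an assumed toy instance (minimize $(x-2)^2$ subject to $x - 1 \leq 0$ on $\mathcal{X} = [-3,3]$, with coefficients $\theta = (1, 2)$ chosen by hand for that instance); none of these specifics come from the derivation above:

```python
# Spot-check the supporting-hyperplane inequality on an assumed toy instance:
#   minimize f(x) = (x - 2)^2  s.t.  g(x) = x - 1 <= 0  on X = [-3, 3],
# whose optimum is x* = 1, f(x*) = 1. Points of A dominate (f(x), g(x)) for
# some x in X; with theta = (1, 2) we expect
#   theta0*y0 + theta1*y1 >= theta0*f(x*)
# at every sampled point of A. (Here f(x) + 2*g(x) = (x-1)^2 + 1 >= 1.)

import random

f = lambda x: (x - 2) ** 2
g = lambda x: x - 1
theta0, theta1, f_star = 1.0, 2.0, 1.0

random.seed(0)
for _ in range(10000):
    x = random.uniform(-3, 3)
    y0 = f(x) + random.uniform(0, 5)  # any value dominating f(x)
    y1 = g(x) + random.uniform(0, 5)  # any value dominating g(x)
    assert theta0 * y0 + theta1 * y1 >= theta0 * f_star - 1e-9
print("separation inequality holds at all sampled points of A")
```

Note that this particular hyperplane is nonvertical ($\theta_0 = 1 \neq 0$), consistent with the toy instance satisfying Slater's condition.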

We say the hyperplane is nonvertical if $\theta_0 \neq 0$, in which case we can define $\mu_i = \theta_i/\theta_0 \geq 0$ for all $i \in \{1, \ldots, k\}$. We call these Lagrange multipliers because they are nonnegative and: $$ \theta_0 \neq 0 \implies f(x) + \sum_{i=1}^k \mu_i g_i(x) \geq f(x^*) \quad \forall x \in \mathcal{X}$$ Assuming $\theta_0 \neq 0$, the above inequality implies that $x^*$ minimizes $f(x) + \sum_{i=1}^k \mu_i g_i(x)$ over all $x \in \mathcal{X}$, that the minimum value is $f(x^*)$, and that $\mu_i g_i(x^*) = 0$ for all $i \in \{1, \ldots, k\}$.
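As a hedged numerical sketch of these consequences, consider an assumed toy instance (again my own, not from the text): minimize $(x-2)^2$ subject to $x - 1 \leq 0$ on $\mathcal{X} = [-3,3]$, with optimum $x^* = 1$. Stationarity $2(x^* - 2) + \mu = 0$ suggests $\mu = 2$, and both the Lagrangian bound and complementary slackness can be checked on a grid:

```python
# Check the Lagrangian bound and complementary slackness on an assumed toy
# instance: minimize f(x) = (x - 2)^2 s.t. g(x) = x - 1 <= 0 on X = [-3, 3],
# with optimum x* = 1 and multiplier mu = 2. Algebraically,
#   f(x) + mu*g(x) = (x - 1)^2 + 1 >= 1 = f(x*)  for every x.

f = lambda x: (x - 2) ** 2
g = lambda x: x - 1
x_star, mu = 1.0, 2.0

grid = [-3 + 6 * t / 1000 for t in range(1001)]
assert all(f(x) + mu * g(x) >= f(x_star) - 1e-9 for x in grid)
assert mu * g(x_star) == 0  # complementary slackness: g(x*) = 0 here
print("x* minimizes the Lagrangian and mu*g(x*) = 0")
```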

Slater's condition

Slater's condition: Suppose there is an $s \in \mathcal{X}$ such that $g_i(s) < 0$ for all $i \in \{1, \ldots, k\}$. (That is, all constraints can be satisfied simultaneously with strictly positive slack.)

Claim: If Slater's condition holds, then $\theta_0 \neq 0$ and hence Lagrange multipliers exist.

Proof: Assume $\theta_0=0$ and plug $s \in \mathcal{X}$ into Eq. 1 to get a contradiction (recalling that not all $\theta_i$ are zero). $\Box$

I do not fill in the details here since I sometimes give these details as a homework exercise (Exercise VIII-B.8 on page 40 here):

http://ee.usc.edu/stochastic-nets/docs/network-optimization-notes.pdf

Fig. 19 on page 55 of those notes gives a simple counterexample where a nonvertical hyperplane does not exist, which is also useful for building intuition on Slater's condition. These examples are standard and similar ones can be found in the Bertsekas book.
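One common textbook counterexample of this type (not necessarily the one in the figure) can be checked numerically: minimize $f(x) = x$ subject to $g(x) = x^2 \leq 0$ over $x \in \mathbb{R}$. The only feasible point is $x^* = 0$, so Slater's condition fails, and no $\mu \geq 0$ satisfies $f(x) + \mu g(x) \geq f(x^*)$ for all $x$:

```python
# A standard example where Slater's condition fails and no Lagrange
# multiplier exists:
#   minimize f(x) = x  subject to g(x) = x^2 <= 0,  x in R.
# The only feasible point is x* = 0 (so g cannot be made strictly negative).
# For any mu >= 0, the Lagrangian x + mu*x^2 dips below f(x*) = 0, so the
# bound f(x) + mu*g(x) >= f(x*) fails for every candidate multiplier.

f = lambda x: x
g = lambda x: x * x

def lagrangian_min_violated(mu, lo=-10.0, hi=10.0, steps=10001):
    """Return True if min_x f(x) + mu*g(x) < f(x*) = 0 on a grid."""
    grid = [lo + (hi - lo) * t / (steps - 1) for t in range(steps)]
    return min(f(x) + mu * g(x) for x in grid) < 0

# No candidate multiplier gives the required bound.
print(all(lagrangian_min_violated(mu) for mu in [0.0, 0.5, 1.0, 5.0, 100.0]))  # -> True
```

(Analytically, for $\mu > 0$ the minimum of $x + \mu x^2$ is $-1/(4\mu) < 0$, attained at $x = -1/(2\mu)$; for $\mu = 0$ the Lagrangian is unbounded below.)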

Further intuition

Going back to the set $\mathcal{A}$ and the boundary point $(f(x^*), 0, 0, ..., 0)$, intuitively we see that if Slater's condition holds then any vertical hyperplane that passes through this boundary point would chop $\mathcal{A}$ into two pieces (with points of $\mathcal{A}$ lying strictly on one side of the vertical hyperplane, and other points lying strictly on the other side). Hence, if Slater's condition is true, then a vertical hyperplane would fail to be a supporting hyperplane. [Again refer to Fig. 19 in the above notes to visualize.]