Fenchel dual vs Lagrange dual
Consider the Fenchel dual and the Lagrangian dual.
Are these duals equivalent? In other words, is using one of the these duals (say for solving an optimization), would give the same answer as using the other one?
I think the answer is no, but I am not sure. One reason for saying that, is that, in the Lagrange dual, we have a relatively straightforward way to add the constraints into the objective function. But what about the Fenchel? I have not seen any.
But I have seen some problems in which both of these give same answers. So, I would assume that, on a subset of problems, these two dualities, are exactly the same.
And also, if they are different, how would you choose which one to use on your problem?
Solution 1:
From your comments, I understand that your confusion is about the relation of Fenchel and Lagrange duality. So I will focus on that relation. Particularly, I would like to contradict the comment which stated, that Lagrange and Fenchel duality are two categorically different concepts.
Consider the primal problem $$\nu(0) = \inf_{x \in \mathbb R^n} \varphi(x,0) = \inf_{x \in \mathbb R^n} f(x)\text,$$ where $\varphi(x,u)$ is the perturbation function of $f$ with $\varphi(x,0) = f(x)$.
Using Fenchel conjugates, one can show that $$\nu^{\ast\ast}(0) = \sup_{p\in\mathbb R^m} -\varphi^\ast(0,p)\text,$$ which is known as the dual problem.
The relation $\nu^{\ast\ast} \leq \nu$ is known as weak duality, and it is immediately seen from $\nu^{\ast\ast}$ being the Fenchel biconjugate of $\nu$. The difference $\nu(0) - \nu^{\ast\ast}(0)$ is the famous duality gap. As you probably know, strong duality refers to the special case $\nu^{\ast\ast}=\nu$, which occurs if $f$ is convex and some other mathematical requirements are satisfied.
Why am I recalling all this? Because you should note, that all these concepts exist despite that we haven't even mentioned Lagrangians yet. These concepts are part of the Fenchel duality.
Now, consider the special case that $$f(x) = f_0(x) + g(F(x))$$ with the perturbation function $$\varphi(x,u) = f_0(x) + g(F(x) + u)\text,$$ such that $\varphi(x,0) = f(x)$ is still satisfied.
With $f_0 : \mathbb R^n \to \overline{\mathbb R}$, $g : \mathbb R^m \to \overline{\mathbb R}$, and $F : \mathbb R^n \to \mathbb R^m$, where $$\overline{\mathbb R} = \mathbb R \cup \{-\infty, +\infty\}$$ is the extension of the real line, this primal problem implicitly induces the feasibility set $\mathcal G = \{ x \in \operatorname{dom} f_0 : F(x) \in \operatorname{dom} g \}$, where $\operatorname{dom}$ denotes the effective domain, that is the set of arguments to a function, where the function is not infinity-valued.
By calculating the Fenchel conjugate $\varphi^\ast$, we obtain the dual problem $$\nu^{\ast\ast}(0) = \sup_{p \in \mathbb R^m} \inf_{x \in \mathbb R^n} L(x,p) - g^\ast(p) \text,$$ where $$L(x,p) = f_0(x) + \langle F(x), p \rangle$$ is called the Lagrangian.
Now, we specialize further. Consider the prominent case, that our primal problem was to minimize a convex function $f_0(x)$ with $\operatorname{dom} f_0 = \mathbb R^n$ subject to $f_i(x) \leq 0$ for all $i = 1, \dots, k$ and $f_i(x) = 0$ for the remaining i = $k+1, \dots, m$. Using the notation of an indicator function $$\delta_K(p) = \begin{cases} 0 & \text{if } p \in K \\ +\infty & \text{else,} \end{cases}$$ we can write the primal problem like $$\inf_{x \in \mathbb R^n} f_0(x) + \delta_K(F(x))\text,$$ where we set $g = \delta_K$, choose $F^{\mathsf T} = \begin{bmatrix}f_1, \dots, f_m\end{bmatrix}$, and $K = \mathbb R^k_- \times \{0\}^{m-k}$ correspondingly. Since the Fenchel conjugate of the indicator function of a closed convex cone $K$ equals to the indicator function of the polar cone $K^\ast$, that is, $\delta_K^\ast = \delta_{K^\ast}$, the dual problem reads $$\nu^{\ast\ast}(0) = \sup_{p \in \mathbb R^m} \inf_{x \in \mathbb R^n} L(x,p) - \delta_{K^\ast}(p) = \sup_{p \in K^\ast} \inf_{x \in \mathbb R^n} L(x,p)\text.$$
Note that by its formal definition, the the polar cone of $K = \mathbb R^k_- \times \{0\}^{m-k}$ $$K^\ast = \left\{ p \in \mathbb R^m \middle| \langle p,q \rangle \leq 0 \forall q \in K \right\}$$ translates to $$K^\ast = \left\{ p \in \mathbb R^m \middle| p_i \geq 0 \forall i = 1, \dots, k \right\} = \mathbb R^k_+ \times \mathbb R^{m-k} \text,$$ which are precisely the constraints that we have in mind, when we choose the Lagrangian multipliers, namely non-negative factors for the inequalities $f_1, \dots, f_k$ and arbitrary factors for the equality constraints $f_{k+1}, \dots, f_m$. Hence, our dual problem is to maximize $$\inf_{x \in \mathbb R^n} L(x,\left[\begin{smallmatrix}\lambda\\\mu\end{smallmatrix}\right]) = \inf_{x \in \mathbb R^n} f_0(x) + \sum_{i=1}^k \lambda_i f_i(x) + \sum_{i=1}^{m-k} \mu_i f_{k+i}(x)$$ w.r.t. the dual variables $\lambda \in \mathbb R^k$ and $\mu \in \mathbb R^{m-k}$, and subject to $\lambda \geq 0$.
So ultimately, we obtain the famous Lagrangian dual problem as a special case of Fenchel duality. To put it more precisely in view of your original question: Lagrangian duality is a result of Fenchel duality, the latter being a more general concept. Thus, of course, they are not equivalent.