Dual problem of a quadratic program

The primal problem is: minimize $\frac{1}{2}y\cdot Py+e\cdot z$ over $(y,z)\in \mathbb{R}^{k}\times\mathbb{R}^{l}$ subject to $Dy+Ez=w$, where $P\in\mathbb{R}^{k\times k}$ is positive definite, $w\in \mathbb{R}^{m}$, $D\in \mathbb{R}^{m\times k}$, $E\in \mathbb{R}^{m\times l}$, and $e\in \mathbb{R}^{l}$. I want to find the dual problem. However, I struggle to handle the $z$ part of the program, as I do not know how to correctly find the infimum of the Lagrangian. Any tips on how to approach this would be appreciated.


Solution 1:

The primal problem is equivalent to \begin{align} \min_{\substack{y\in\mathbb{R}^{k}\\z\in\mathbb{R}^{\ell}}}\max_{\lambda\in\mathbb{R}^m} \frac{y^{\mathrm{T}}{P}y}{2}+e^{\mathrm{T}}z+\lambda^{\mathrm{T}}(Dy+Ez-w)\tag{1} \end{align} The dual problem is given by \begin{align} \max_{\lambda\in\mathbb{R}^m}\min_{\substack{y\in\mathbb{R}^{k}\\z\in\mathbb{R}^{\ell}}} \frac{y^{\mathrm{T}}{P}y}{2}+e^{\mathrm{T}}z+\lambda^{\mathrm{T}}(Dy+Ez-w)\tag{2}. \end{align} Weak duality says that the dual value is less than or equal to the primal value, i.e. $(1)\ge(2)$. Let $0_{i\times j}$ be the $i\times j$ matrix of all zeros, i.e. the zero matrix in $\mathbb{R}^{i\times j}$. We can rewrite the dual (2) as \begin{align} \max_{\lambda\in\mathbb{R}^m}\min_{\substack{y\in\mathbb{R}^{k}\\z\in\mathbb{R}^{\ell}}} \begin{bmatrix}y^\mathrm{T}&z^{\mathrm{T}}\end{bmatrix}\begin{bmatrix}\frac{1}{2}P& 0_{k\times\ell} \\ 0_{\ell\times k} &0_{\ell\times\ell}\end{bmatrix}\begin{bmatrix}y\\z\end{bmatrix}+\begin{bmatrix}\lambda^{\mathrm{T}}D&e^{\mathrm{T}}+\lambda^{\mathrm{T}}E\end{bmatrix}\begin{bmatrix}y\\z\end{bmatrix}-\lambda^{\mathrm{T}}w.\tag{3} \end{align} Consider the inner minimization in (3) for a fixed value of $\lambda$, i.e. consider the unconstrained problem \begin{align} \min_{\substack{y\in\mathbb{R}^{k}\\z\in\mathbb{R}^{\ell}}} \begin{bmatrix}y^\mathrm{T}&z^{\mathrm{T}}\end{bmatrix}\begin{bmatrix}\frac{1}{2}P& 0_{k\times\ell} \\ 0_{\ell\times k} &0_{\ell\times\ell}\end{bmatrix}\begin{bmatrix}y\\z\end{bmatrix}+\begin{bmatrix}\lambda^{\mathrm{T}}D&e^{\mathrm{T}}+\lambda^{\mathrm{T}}E\end{bmatrix}\begin{bmatrix}y\\z\end{bmatrix}-\lambda^{\mathrm{T}}w.\tag{4} \end{align} Note that $z$ is unconstrained and appears only in a linear term. Consider what happens when $e^{\mathrm{T}}+\lambda^{\mathrm{T}}E\neq0_{1\times\ell}$. Since any $z$ may be chosen, for arbitrarily large $r\in\mathbb{R}$ we can take $z=-r(e+E^{\mathrm{T}}\lambda)$, which makes the linear term in $z$ equal to $-r\|e+E^{\mathrm{T}}\lambda\|^{2}$.
Thus if $e^{\mathrm{T}}+\lambda^{\mathrm{T}}E\neq0_{1\times\ell}$, (4) equals $-\infty$. Now let's assume that $e^{\mathrm{T}}+\lambda^{\mathrm{T}}E=0_{1\times\ell}$. In this case (4) is equivalent to \begin{align} \min_{y\in\mathbb{R}^k}\frac{y^{\mathrm{T}}Py}{2}+\lambda^{\mathrm{T}}Dy-\lambda^{\mathrm{T}}w\tag{5}. \end{align} Since $P$ is positive definite, this problem is strictly convex and has a unique minimizer, found by taking the gradient of $\frac{y^{\mathrm{T}}Py}{2}+\lambda^{\mathrm{T}}Dy-\lambda^{\mathrm{T}}w$ with respect to $y$ and setting it equal to zero. Computing the gradient (with $P$ symmetric, as usual for a positive definite matrix) and setting it equal to zero gives \begin{align} Py+D^{\mathrm{T}}\lambda=0. \end{align} The minimizing $y$ is therefore \begin{align} y^{*} = -P^{-1}D^{\mathrm{T}}\lambda. \end{align} Plugging this into (5) gives that when $e^{\mathrm{T}}+\lambda^{\mathrm{T}}E=0_{1\times\ell}$, (4) is equal to \begin{align} -\frac{\lambda^{\mathrm{T}}DP^{-1}D^{\mathrm{T}}\lambda}{2}-\lambda^{\mathrm{T}}w\tag{6}. \end{align} Combining what we've derived so far, we have the following "cases" for (4): \begin{align} \min_{\substack{y\in\mathbb{R}^{k}\\z\in\mathbb{R}^{\ell}}} \begin{bmatrix}y^\mathrm{T}&z^{\mathrm{T}}\end{bmatrix}\begin{bmatrix}\frac{1}{2}P& 0_{k\times\ell} \\ 0_{\ell\times k} &0_{\ell\times\ell}\end{bmatrix}\begin{bmatrix}y\\z\end{bmatrix}+\begin{bmatrix}\lambda^{\mathrm{T}}D&e^{\mathrm{T}}+\lambda^{\mathrm{T}}E\end{bmatrix}\begin{bmatrix}y\\z\end{bmatrix}-\lambda^{\mathrm{T}}w =\\ \begin{cases}-\frac{\lambda^{\mathrm{T}}DP^{-1}D^{\mathrm{T}}\lambda}{2}-\lambda^{\mathrm{T}}w,\text{ when }e^{\mathrm{T}}+\lambda^{\mathrm{T}}E=0_{1\times\ell}\\-\infty\text{, otherwise}\end{cases} \end{align} Now substitute the expression on the right-hand side of this equation into the dual problem (2).
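As a quick numerical sanity check of the minimizer and the value (6), here is a NumPy sketch on a randomly generated test instance (all data below are made up; $\lambda$ is chosen first and $e$ is constructed so that $e^{\mathrm{T}}+\lambda^{\mathrm{T}}E=0$ holds by design):

```python
import numpy as np

rng = np.random.default_rng(0)
k, l, m = 4, 3, 2  # arbitrary small dimensions for the test

# Random symmetric positive definite P and random D, E, w (hypothetical data)
A = rng.standard_normal((k, k))
P = A @ A.T + k * np.eye(k)
D = rng.standard_normal((m, k))
E = rng.standard_normal((m, l))
w = rng.standard_normal(m)

# Pick lambda first, then force the condition e^T + lambda^T E = 0
lam = rng.standard_normal(m)
e = -E.T @ lam

# Closed-form minimizer of (5) and the closed-form value from (6)
y_star = -np.linalg.solve(P, D.T @ lam)
val6 = -0.5 * lam @ D @ np.linalg.solve(P, D.T @ lam) - lam @ w

# Objective of (5) as a function of y
obj5 = lambda y: 0.5 * y @ P @ y + lam @ D @ y - lam @ w

assert np.allclose(P @ y_star + D.T @ lam, 0)  # gradient vanishes at y*
assert np.isclose(obj5(y_star), val6)          # minimum value matches (6)
# Any perturbation can only increase the strictly convex objective
assert obj5(y_star + rng.standard_normal(k)) >= obj5(y_star)
```

If the assertions pass, the stationarity condition $Py+D^{\mathrm{T}}\lambda=0$ and the value (6) agree on this instance.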
Define the function $f(\lambda)$ by \begin{align} f(\lambda) = \begin{cases}-\frac{\lambda^{\mathrm{T}}DP^{-1}D^{\mathrm{T}}\lambda}{2}-\lambda^{\mathrm{T}}w,\text{ when }e^{\mathrm{T}}+\lambda^{\mathrm{T}}E=0_{1\times\ell}\\-\infty\text{, otherwise}\end{cases} \end{align} so that (2) is equivalent to \begin{align} \max_{\lambda\in\mathbb{R}^{m}}f(\lambda)\tag{7} \end{align} Since (7) is a maximization and any $\lambda$ with $e^{\mathrm{T}}+\lambda^{\mathrm{T}}E\neq0_{1\times\ell}$ gives $f(\lambda)=-\infty$, (7) can equivalently be written as the constrained program \begin{align} \max_{\substack{\lambda\in\mathbb{R}^{m}\\e^{\mathrm{T}}+\lambda^{\mathrm{T}}E=0_{1\times\ell}}}-\frac{\lambda^{\mathrm{T}}DP^{-1}D^{\mathrm{T}}\lambda}{2}-\lambda^{\mathrm{T}}w \end{align} which is about as simple a form of the dual problem as I can think of. To simplify further you would need additional assumptions.
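To check strong duality end to end, note that the stationarity conditions $Py+D^{\mathrm{T}}\lambda=0$ and $e+E^{\mathrm{T}}\lambda=0$ together with primal feasibility $Dy+Ez=w$ form one linear (KKT) system in $(y,z,\lambda)$. A NumPy sketch on random data (this assumes $E$ has full column rank with $m\ge\ell$ and $[D\ E]$ has full row rank, so the KKT matrix is nonsingular; all data are made up):

```python
import numpy as np

rng = np.random.default_rng(1)
k, l, m = 4, 2, 3  # need m >= l so a random E has full column rank

A = rng.standard_normal((k, k))
P = A @ A.T + k * np.eye(k)      # symmetric positive definite
D = rng.standard_normal((m, k))
E = rng.standard_normal((m, l))
e = rng.standard_normal(l)
w = rng.standard_normal(m)

# KKT system: Py + D^T lam = 0,  E^T lam = -e,  Dy + Ez = w
K = np.block([
    [P,                np.zeros((k, l)), D.T],
    [np.zeros((l, k)), np.zeros((l, l)), E.T],
    [D,                E,                np.zeros((m, m))],
])
rhs = np.concatenate([np.zeros(k), -e, w])
y, z, lam = np.split(np.linalg.solve(K, rhs), [k, k + l])

primal = 0.5 * y @ P @ y + e @ z
dual = -0.5 * lam @ D @ np.linalg.solve(P, D.T @ lam) - lam @ w

assert np.allclose(D @ y + E @ z, w)   # primal feasibility
assert np.allclose(E.T @ lam, -e)      # dual feasibility e^T + lam^T E = 0
assert np.isclose(primal, dual)        # strong duality holds at the optimum
```

The last assertion is exactly the statement that the optimal value of the constrained dual above coincides with the primal optimum for this convex, equality-constrained quadratic program.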