Why is the optimal solution to $ \min\limits_{S \in \mathbb S^{n}_{+}} -\log(\det(S))+\text{trace}(\Sigma S)$ equal to $\Sigma^{-1}$?

My attempted reformulation of this problem is first to determine the eigenvalues of the matrix $S$.

Now, the reformulated problem is to examine the sum of the logarithms of the reciprocals of these eigenvalues of $S$ (this is the $-\log(\det(S))$ part), subject to the condition that their sum equals the trace of the inverse of the given $\Sigma$ matrix.

Trimming the largest reciprocal values, which come from the smallest (near-zero) eigenvalues of $S$, seems like a place to start.

Finally, re-apply the eigenvector transformation matrix to produce a final test matrix as a candidate solution.
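
Explicitly, the eigenvalue identity behind this reformulation, writing $\lambda_1,\dots,\lambda_n>0$ for the eigenvalues of a positive-definite $S$, is

$$ -\log(\det(S)) = -\sum_{i=1}^{n}\log(\lambda_i) = \sum_{i=1}^{n}\log\!\left(\frac{1}{\lambda_i}\right). $$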


Define $F : \mathbb{S}_+^n \to (-\infty, +\infty]$ by

$$ F(S) = -\log\det S + \operatorname{tr}(\Sigma S). $$

Also, denote by $\operatorname{int}(\mathbb{S}_+^{n})$ the set of all $n\times n$ positive-definite matrices. In this answer, we will only consider the case where $\Sigma \in \operatorname{int}(\mathbb{S}_+^{n})$.

Let $S \in \operatorname{int}(\mathbb{S}_+^{n})$. Then for any $n\times n$ symmetric $H$, there exists $\delta > 0$ such that $\| t S^{-1}H \| < 1$ and $S + tH \in \operatorname{int}(\mathbb{S}_+^{n})$ for all $|t| < \delta$, where $\| \cdot \|$ is the operator norm. Moreover, for such $t$, we get

\begin{align*} F(S + t H) &= F(S) - \log \det (I + t S^{-1}H) + \operatorname{tr}(\Sigma H) t \\ &= F(S) - \operatorname{tr} \log (I + t S^{-1}H) + \operatorname{tr}(\Sigma H) t . \end{align*}
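
To make the differentiation explicit: since $\|tS^{-1}H\|<1$, the matrix logarithm can be expanded as a convergent power series, $\log(I+A)=A-\tfrac12 A^2+\cdots$, which gives

$$ \operatorname{tr}\log(I + tS^{-1}H) = t\operatorname{tr}(S^{-1}H) - \frac{t^2}{2}\operatorname{tr}\bigl((S^{-1}H)^2\bigr) + O(t^3) $$

and hence

$$ F(S + tH) = F(S) + t\operatorname{tr}\bigl((\Sigma - S^{-1})H\bigr) + \frac{t^2}{2}\operatorname{tr}\bigl((S^{-1}H)^2\bigr) + O(t^3). $$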

From this, it follows that

$$ \left. \frac{\mathrm{d}}{\mathrm{d}t} \right|_{t=0} F(S + t H) = \operatorname{tr}((\Sigma - S^{-1})H) \tag{1} $$

and

$$ \left. \frac{\mathrm{d}^2}{\mathrm{d}t^2} \right|_{t=0} F(S + t H) = \operatorname{tr}(S^{-1}HS^{-1}H) = \operatorname{tr}\bigl((S^{-1/2}HS^{-1/2})^2\bigr) \geq 0 \tag{2} $$

with equality if and only if $H = 0$, since the last trace is the squared Frobenius norm of the symmetric matrix $S^{-1/2}HS^{-1/2}$, which vanishes only when $H = 0$.

Then $\text{(2)}$ shows that $F$ is strictly convex on $\operatorname{int}(\mathbb{S}_+^n)$; since $F \equiv +\infty$ on the boundary (where $\det S = 0$), it follows that $F$ is convex on all of $\mathbb{S}_+^n$ and that no boundary point can be a minimizer. Moreover, $\text{(1)}$ shows that $S = \Sigma^{-1}$ is the unique critical point of $F$, because the right-hand side of $\text{(1)}$ vanishes for every symmetric $H$ exactly when $\Sigma = S^{-1}$. Therefore $\Sigma^{-1}$ is the unique minimizer of $F$, as required.
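
As a quick numerical sanity check, separate from the proof, one can draw a random positive-definite $\Sigma$ and confirm that $F(\Sigma^{-1})$ does not exceed $F$ at nearby positive-definite perturbations. Below is a minimal NumPy sketch; the function name `objective` and the other names are mine, not from the question.

```python
import numpy as np

rng = np.random.default_rng(0)

def objective(S, Sigma):
    """F(S) = -log det(S) + trace(Sigma @ S); assumes S is positive definite."""
    sign, logdet = np.linalg.slogdet(S)
    assert sign > 0, "S must be positive definite"
    return -logdet + np.trace(Sigma @ S)

n = 5
A = rng.standard_normal((n, n))
Sigma = A @ A.T + n * np.eye(n)   # a random positive-definite Sigma
S_star = np.linalg.inv(Sigma)     # the claimed minimizer Sigma^{-1}
f_star = objective(S_star, Sigma)

for _ in range(1000):
    H = rng.standard_normal((n, n))
    H = (H + H.T) / 2                               # random symmetric direction
    S = S_star + 1e-2 * H / np.linalg.norm(H, 2)    # small perturbation of Sigma^{-1}
    if np.all(np.linalg.eigvalsh(S) > 0):           # only compare where F is finite
        assert objective(S, Sigma) >= f_star - 1e-12

print("F(Sigma^{-1}) =", f_star, "is minimal among all sampled perturbations")
```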


The minimum does not exist (the infimum is $-\infty$) if $\Sigma$ is singular. In fact, if you define $Sx=kx$ when $x\in\ker\Sigma$ and $Sx=x$ when $x\in(\ker\Sigma)^\perp=\operatorname{range}(\Sigma)$, then $S$ is positive definite for every $k>0$ and $$ \lim_{k\to+\infty}\bigl(-\log\det(S)+\operatorname{tr}(\Sigma S)\bigr) =\lim_{k\to+\infty}\bigl(-\operatorname{nullity}(\Sigma)\log(k)+\operatorname{tr}(\Sigma)\bigr) =-\infty. $$
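
For a concrete instance of this divergence, take $n=2$ with

$$ \Sigma = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}, \qquad S = \begin{pmatrix} 1 & 0 \\ 0 & k \end{pmatrix}, $$

so that $-\log\det(S)+\operatorname{tr}(\Sigma S) = -\log(k) + 1 \to -\infty$ as $k\to+\infty$.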

Suppose $\Sigma$ is nonsingular, that is, positive definite. We may restrict attention to positive-definite $S$, since the objective is $+\infty$ when $S$ is singular. Then $P=\Sigma^{1/2}S\Sigma^{1/2}$ is positive definite too. Denote the spectrum of $P$ by $\sigma$, with eigenvalues counted with multiplicity. Then
$$
\begin{aligned}
-\log\det(S)+\operatorname{tr}(\Sigma S)
&=\log\det\Sigma-\log\det(P)+\operatorname{tr}(P)\\
&=\log\det\Sigma+\sum_{\lambda\in\sigma}(-\log\lambda+\lambda).
\end{aligned}
$$
Since $-\log\lambda+\lambda$ attains its unique global minimum over $(0,\infty)$ at $\lambda=1$, the objective attains its minimum exactly when every eigenvalue of $P$ equals $1$, i.e. when $P=I$. Hence $S=\Sigma^{-1}$ is the unique global minimiser.
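
For completeness, the scalar step can be spelled out: for $\lambda>0$,

$$ \frac{\mathrm{d}}{\mathrm{d}\lambda}\bigl(-\log\lambda+\lambda\bigr) = 1-\frac{1}{\lambda}, $$

which is negative on $(0,1)$ and positive on $(1,\infty)$, so $-\log\lambda+\lambda\ge 1$ with equality only at $\lambda=1$. Summing over the $n$ eigenvalues of $P$ gives

$$ -\log\det(S)+\operatorname{tr}(\Sigma S)\;\ge\;\log\det\Sigma+n, $$

with equality exactly when every eigenvalue of $P$ equals $1$, i.e. $P=I$, i.e. $S=\Sigma^{-1}$.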