Gradient and Hessian of a function with Matrix Variables

Solution 1:

The gradient is simple, at least if you have the Matrix Cookbook: $$\nabla f(X) = D - \frac{c}{2}\langle X, E\rangle^{-1/2} E.$$ Assuming $D\neq 0$, $E\neq 0$, and $c>0$, $\nabla f(X)=0$ is possible only if $D=\alpha E$ for some scalar $\alpha>0$. In that case, the gradient vanishes whenever $(c/2)\langle X,E\rangle^{-1/2}=\alpha$, that is, whenever $\langle X,E\rangle = c^2/(4\alpha^2)$.
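
If you want to sanity-check the gradient numerically, here is a minimal NumPy sketch that compares it against central finite differences. It assumes $f(X) = \langle D, X\rangle - c\,\langle X, E\rangle^{1/2}$ with $\langle A, B\rangle = \mathrm{tr}(A^TB)$, which is the form consistent with the gradient above. The function names, sizes, and random data are made up for illustration.

```python
import numpy as np

def inner(A, B):
    # Frobenius inner product <A, B> = trace(A^T B)
    return float(np.sum(A * B))

def f(X, D, E, c):
    # Assumed objective: <D, X> - c * sqrt(<X, E>)
    return inner(D, X) - c * np.sqrt(inner(X, E))

def grad_f(X, D, E, c):
    # Closed-form gradient: D - (c/2) <X, E>^{-1/2} E
    return D - 0.5 * c * inner(X, E) ** -0.5 * E

def numeric_grad(X, D, E, c, h=1e-6):
    # Central differences, one entry of X at a time
    G = np.zeros_like(X)
    for idx in np.ndindex(*X.shape):
        P = np.zeros_like(X)
        P[idx] = h
        G[idx] = (f(X + P, D, E, c) - f(X - P, D, E, c)) / (2 * h)
    return G

rng = np.random.default_rng(0)
m, n, c = 3, 4, 2.0
# Positive entries keep <X, E> > 0, so f is differentiable at X
X = rng.uniform(0.5, 1.5, (m, n))
D = rng.uniform(0.5, 1.5, (m, n))
E = rng.uniform(0.5, 1.5, (m, n))

grad_error = np.max(np.abs(grad_f(X, D, E, c) - numeric_grad(X, D, E, c)))
print(grad_error)

# Stationary case: D = alpha * E, with X rescaled so <X, E> = (c / (2 alpha))^2
alpha = 0.7
X_star = X * (c / (2 * alpha)) ** 2 / inner(X, E)
stationary_norm = np.linalg.norm(grad_f(X_star, alpha * E, E, c))
print(stationary_norm)
```

Both printed values should be tiny: the first is the worst-case mismatch between the formula and the finite-difference estimate, and the second is the gradient norm at a point that satisfies the stationarity condition.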

The Hessian isn't difficult in concept either; the challenge is writing it down. The Hessian of a scalar function of a vector, $g:\mathbb{R}^n\rightarrow\mathbb{R}$, can be represented by a symmetric matrix. But when you have a scalar function of a matrix, $f:\mathbb{R}^{m\times n}\rightarrow\mathbb{R}$, you can't represent it by a matrix anymore. Instead, the Hessian is a symmetric linear mapping on $\mathbb{R}^{m\times n}$. The best you can do, in my view, is look at the second directional derivative. If $H$ is the search direction, then $$D^2f(X)[H,H] = \langle \nabla^2 f(X)[H],H\rangle = \frac{c}{4} \langle X, E\rangle^{-3/2} \langle E, H \rangle^2.$$

Another way to look at it is that $\mathbb{R}^{m\times n}$ is isomorphic to $\mathbb{R}^{mn}$ via the vectorization function $\textbf{vec}$. If you define $$g:\mathbb{R}^{mn}\rightarrow\mathbb{R}, \quad g(x) \triangleq f({\textbf{vec}}^{-1}(x))$$ then $$\nabla g(x) = d - \frac{c}{2}(e^Tx)^{-1/2} e, \quad \nabla^2 g(x) = \frac{c}{4}(e^Tx)^{-3/2} ee^T$$ where $d\triangleq\textbf{vec}(D)$ and $e\triangleq\textbf{vec}(E)$.
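
To see that the two views of the Hessian agree, here is a rough NumPy sketch. It evaluates the quadratic form from the matrix formula, the same quantity through the $mn\times mn$ matrix $\nabla^2 g(x)$, and a finite-difference estimate. It assumes $g(x) = d^Tx - c\,(e^Tx)^{1/2}$, the form consistent with the gradient above. The helper names and random test data are illustrative only.

```python
import numpy as np

def vec(A):
    # Column-major flattening, matching the usual definition of vec
    return A.ravel(order="F")

def g(x, d, e, c):
    # Assumed objective in vectorized form: d^T x - c * sqrt(e^T x)
    return d @ x - c * np.sqrt(e @ x)

def hess_g(x, e, c):
    # Closed-form Hessian: (c/4) (e^T x)^{-3/2} e e^T, a rank-one matrix
    return 0.25 * c * (e @ x) ** -1.5 * np.outer(e, e)

rng = np.random.default_rng(1)
m, n, c = 2, 3, 2.0
# Positive entries keep <X, E> > 0
X = rng.uniform(0.5, 1.5, (m, n))
D = rng.uniform(0.5, 1.5, (m, n))
E = rng.uniform(0.5, 1.5, (m, n))
H = rng.standard_normal((m, n))  # arbitrary direction

x, d, e, h = vec(X), vec(D), vec(E), vec(H)

# Quadratic form from the matrix formula: (c/4) <X,E>^{-3/2} <E,H>^2
quad_matrix = 0.25 * c * np.sum(X * E) ** -1.5 * np.sum(E * H) ** 2
# Same quantity via the mn-by-mn Hessian of g
quad_vec = h @ hess_g(x, e, c) @ h

# Second-order central difference of t -> g(x + t h) at t = 0
t = 1e-4
quad_fd = (g(x + t * h, d, e, c) - 2 * g(x, d, e, c) + g(x - t * h, d, e, c)) / t**2

print(quad_matrix, quad_vec, quad_fd)
```

The three printed numbers should match up to finite-difference error. Because the inner product is a sum over entries, any consistent flattening order would work here; column-major is used only to match the conventional $\textbf{vec}$.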

EDIT: The semidefiniteness of the matrices is largely irrelevant to these derivatives. However, it does ensure that $\langle X, E \rangle \geq 0$, so the function is well-defined for all of the desired values of $X$. Note that the function is not differentiable when $\langle X, E \rangle = 0$.