Why is the gradient of $\log{\det{X}}$ equal to $X^{-1}$, and where did the trace $\text{tr}()$ go?
Solution 1:
First of all, if you write (for a general function $f: U \to \mathbb R$, where $U \subset \mathbb R^K$)
$$f(y) \approx f(x) + Df(x) (y-x),$$
then the term $Df(x) (y-x)$ is really
$$\sum_{i=1}^K D_i f(x) \ (y_i - x_i).$$
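Now the function $Z\mapsto \log\det (Z)$ is defined on $S^n_{++}$, which we regard as sitting inside $\mathbb R^{n^2}$ (the formula itself makes sense on the open set of matrices with positive determinant), so it has $n^2$ coordinates given by $Z_{ij}$, where $i, j = 1, \cdots, n$.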
Now take a look at
$$\begin{split} \text{tr} \left( X^{-1} (Z-X)\right) &= \sum_{i=1}^n \left(X^{-1} (Z-X) \right)_{ii}\\ &= \sum_{i=1}^n \sum_{j=1}^n X^{-1}_{ij} (Z_{ji}-X_{ji}) \\ &= \sum_{i=1}^n \sum_{j=1}^n \left( (X^{-1})^T \right)_{ji} (Z_{ji}-X_{ji}) \end{split}$$
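Comparing this with the coordinate form $\sum_i D_i f \ (y_i - x_i)$, the coefficient of $Z_{ji}-X_{ji}$ is $X^{-1}_{ij}$, so we should identify $(X^{-1})^T$ as the gradient of $\log \det$. Since $X \in S^n_{++}$ is symmetric, $(X^{-1})^T = X^{-1}$, which is why the gradient is usually written as $X^{-1}$.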
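This identification can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the helper names `logdet` and `numerical_gradient` and the test matrix are illustrative choices, not part of the answer. It treats all $n^2$ entries as independent coordinates, approximates each partial derivative by a central difference, and compares the result with $(X^{-1})^T$. The matrix is deliberately non-symmetric (but with positive determinant) so that the transpose is visible.

```python
import numpy as np

def logdet(X):
    # log(det(X)) computed via slogdet; assumes det(X) > 0
    sign, value = np.linalg.slogdet(X)
    return value

def numerical_gradient(f, X, h=1e-6):
    # Central differences, perturbing one entry X[i, j] at a time
    G = np.zeros_like(X, dtype=float)
    for i in range(X.shape[0]):
        for j in range(X.shape[1]):
            E = np.zeros_like(X, dtype=float)
            E[i, j] = h
            G[i, j] = (f(X + E) - f(X - E)) / (2 * h)
    return G

# Non-symmetric test matrix with det(X) = 13 > 0 (illustrative choice)
X = np.array([[2.0, 1.0, 0.0],
              [0.0, 3.0, 1.0],
              [1.0, 0.0, 2.0]])

G = numerical_gradient(logdet, X)
matches_transpose = np.allclose(G, np.linalg.inv(X).T, atol=1e-6)
matches_plain_inverse = np.allclose(G, np.linalg.inv(X), atol=1e-6)
print(matches_transpose, matches_plain_inverse)
```

For this non-symmetric $X$ the matrix of partial derivatives agrees with $(X^{-1})^T$ and not with $X^{-1}$; for a symmetric $X$ the two coincide.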
Solution 2:
The trace $\text{tr}(X^{-1}(Z-X))$ is the standard inner product of $X^{-1}$ and $Z-X$, namely $\langle A, B\rangle = \text{tr}(A^T B)$, where $X^{-1}$ is symmetric. The choice of inner product depends on the specific space.
So it does not mean that $\text{tr}(X^{-1}(Z-X)) = Df(X)(Z-X)$ with the right-hand side read as a matrix product. It is the definition of the inner product that matters.
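To see the trace acting as an inner product rather than a matrix product, here is a small sketch, again assuming NumPy; the random matrices and variable names are made up for illustration. It checks that $\text{tr}(AB)$ equals the entrywise sum $\sum_{i,j} (A^T)_{ij} B_{ij}$, and that for a symmetric matrix $S$ this reduces to $\sum_{i,j} S_{ij} B_{ij}$.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
B = rng.standard_normal((4, 4))

# tr(AB) is a scalar: the entrywise (Frobenius) inner product of A^T and B
trace_form = np.trace(A @ B)
inner_with_transpose = np.sum(A.T * B)

# For a symmetric matrix the transpose can be dropped
S = A + A.T
sym_trace = np.trace(S @ B)
sym_inner = np.sum(S * B)

print(np.isclose(trace_form, inner_with_transpose))
print(np.isclose(sym_trace, sym_inner))
```

Note that `A @ B` alone is a $4 \times 4$ matrix, whereas the first-order term of a real-valued function must be a scalar; taking the trace is what turns it into the inner product.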