How to take the derivative of a matrix with respect to itself?

$\def\p#1#2{\frac{\partial #1}{\partial #2}}$The gradient of a matrix with respect to a matrix is a fourth-order tensor, which is easily evaluated in index notation $$\eqalign{ {\cal E} &= \p{X}{X} \quad\implies\quad {\cal E}_{ijk\ell} &= \p{X_{ij}}{X_{k\ell}} &= \delta_{ik}\delta_{j\ell} \\ }$$ where the Kronecker delta symbol is defined as $$\eqalign{ \delta_{ik} &= \begin{cases} {\tt1}\quad {\rm if}\;i=k \\ 0\quad {\rm otherwise} \end{cases} }$$ In words: if $X_{ij}$ and $X_{k\ell}$ refer to different elements, then the derivative is $0$; otherwise it is $\tt1$.
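The index formula can be checked numerically. The following is a minimal sketch in plain Python, not part of the original answer: the helper names `kron`, `identity_tensor`, and `numerical_gradient` are hypothetical, and the derivative is estimated by central differences rather than computed symbolically.

```python
def kron(a, b):
    """Kronecker delta: 1 if a == b, else 0."""
    return 1 if a == b else 0


def identity_tensor(m, n):
    """Build E[i][j][k][l] = delta_ik * delta_jl for an m-by-n matrix X."""
    return [[[[kron(i, k) * kron(j, l) for l in range(n)]
              for k in range(m)]
             for j in range(n)]
            for i in range(m)]


def numerical_gradient(f, X, h=1e-6):
    """Central-difference estimate of G[i][j][k][l] = d f(X)_ij / d X_kl.

    Assumes f returns a matrix of the same shape as X (hypothetical helper).
    """
    m, n = len(X), len(X[0])
    G = [[[[0.0] * n for _ in range(m)] for _ in range(n)] for _ in range(m)]
    for k in range(m):
        for l in range(n):
            # Perturb only the element X_kl, in both directions.
            Xp = [row[:] for row in X]
            Xm = [row[:] for row in X]
            Xp[k][l] += h
            Xm[k][l] -= h
            Fp, Fm = f(Xp), f(Xm)
            for i in range(m):
                for j in range(n):
                    G[i][j][k][l] = (Fp[i][j] - Fm[i][j]) / (2 * h)
    return G


if __name__ == "__main__":
    X = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
    E = identity_tensor(2, 3)
    G = numerical_gradient(lambda A: [row[:] for row in A], X)
    # Largest deviation between the finite-difference estimate and the formula.
    err = max(abs(G[i][j][k][l] - E[i][j][k][l])
              for i in range(2) for j in range(3)
              for k in range(2) for l in range(3))
    print("max abs deviation:", err)
```

The entry `E[i][j][k][l]` is nonzero only when `(i, j) == (k, l)`, which is exactly the "same element" condition stated in words above.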

This is analogous to the vector/vector derivative, which produces the identity matrix $$\eqalign{ I &= \p{x}{x} \quad\implies\quad {I}_{ij} &= \p{x_i}{x_j} &= \delta_{ij} \\ }$$


The underlying mapping is $$ f(X)=X, $$ the identity mapping on the vector space $V$ of all matrices of a given size. It is linear, hence its derivative at $X$ in direction $\delta X$ is $$ f'(X)\delta X=\delta X, $$ that is, $$ f'(X) = f $$ for every $X$. Note that both $f$ and $f'(X)$ are linear mappings from $V$ to $V$, whereas $f'$ itself is a mapping from $V$ to $L(V,V)$, the space of linear mappings from $V$ to $V$.
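As a short consistency check between the two descriptions (a sketch, assuming that $f'(X)$ acts on a direction $\delta X$ by contracting the last two indices of the tensor ${\cal E}_{ijk\ell}=\delta_{ik}\delta_{j\ell}$; here $\delta_{ik}$ is the Kronecker delta, while $\delta X$ is a direction matrix): $$ \big(f'(X)\,\delta X\big)_{ij} = \sum_{k,\ell} {\cal E}_{ijk\ell}\,\delta X_{k\ell} = \sum_{k,\ell} \delta_{ik}\delta_{j\ell}\,\delta X_{k\ell} = \delta X_{ij}, $$ so the fourth-order tensor of Kronecker deltas is the component form of the identity mapping on $V$.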