Solution 1:

The derivative you ask for is a third-order tensor. We can, however, easily compute $\frac{\partial f_n}{\partial \mathbf{X}}$, where $f_n$ is the $n$-th component of the vector $\mathbf{f}$.

Note that $f_n = \| \mathbf{y}_n \|$, where $\mathbf{y}_n = \mathbf{a}_n \mathbf{X}$ is a 1-by-3 row vector and $\mathbf{a}_n$ is the $n$-th row of $\mathbf{A}$.

From here \begin{eqnarray*} df_n &=& \frac{1}{f_n} \mathbf{y}_n : d\mathbf{y}_n \;=\; \frac{1}{f_n} \mathbf{y}_n : \mathbf{a}_n \, d\mathbf{X} \;=\; \frac{1}{f_n} \mathbf{a}_n^T \mathbf{y}_n : d\mathbf{X} \end{eqnarray*} where the colon stands for the Frobenius inner product; the last step uses the identity $\mathbf{u} : \mathbf{a}_n\, d\mathbf{X} = \mathbf{a}_n^T \mathbf{u} : d\mathbf{X}$. The derivative is thus

$$ \frac{\partial f_n}{\partial \mathbf{X}} = \frac{1}{\| \mathbf{a}_n \mathbf{X} \|} \mathbf{a}_n^T \mathbf{y}_n $$
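As a sanity check, here is a minimal NumPy sketch comparing this closed form against central finite differences. The shapes and the row index are illustrative assumptions ($\mathbf{A}$ is taken square for concreteness; any compatible shape works):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5
A = rng.standard_normal((n, n))   # square A assumed for concreteness
X = rng.standard_normal((n, 3))

def f(X):
    # f_n = ||a_n X||, i.e. the norm of the n-th row of AX
    return np.linalg.norm(A @ X, axis=1)

# closed form: (1/||a_n X||) a_n^T y_n  with  y_n = a_n X
n_idx = 2
y = A[n_idx] @ X                               # 1-by-3 row vector y_n
grad = np.outer(A[n_idx], y) / np.linalg.norm(y)

# central finite differences over every entry of X
eps = 1e-6
fd = np.zeros_like(X)
for i in range(X.shape[0]):
    for j in range(X.shape[1]):
        dX = np.zeros_like(X)
        dX[i, j] = eps
        fd[i, j] = (f(X + dX)[n_idx] - f(X - dX)[n_idx]) / (2 * eps)

print(np.allclose(grad, fd, atol=1e-6))        # True
```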

Solution 2:

$ \def\g#1#2{\frac{\partial #1}{\partial #2}} \def\LR#1{\left(#1\right)} \def\BR#1{\Big(#1\Big)} \def\bR#1{\big(#1\big)} \def\e{{\large\varepsilon}} \def\o{{\large\tt1}} \def\bbR#1{{\mathbb R}^{#1}} \def\diag#1{\operatorname{diag}\LR{#1}} \def\Diag#1{\operatorname{Diag}\!\bR{#1}} $ The requested gradient is a third-order tensor, which can be obtained by calculating the vector-valued derivatives $$\eqalign{ f\odot f &= \BR{(AX)\odot(AX)}\,\o \quad\doteq\; \diag{AXX^TA^T} \\ 2f\odot \g{f}{X_{ij}} &= \BR{2(AX)\odot(AE_{ij})}\,\o \\ \g{f}{X_{ij}} &= \Diag{f}^{-1}\cdot\BR{(AX)\odot(AE_{ij})}\,\o \\ &= \Diag{f}^{-1}\cdot\diag{AE_{ij}X^TA^T} \\ }$$ and summing them $$\eqalign{ \g{f}{X} &= \sum_{i=1}^n\sum_{j=1}^3\LR{\g{f}{X_{ij}}}\star E_{ij} \;\in\bbR{(n)\times n\times 3} \\ }$$ where $E_{ij}\in\bbR{n\times 3}$ is a standard basis matrix with component $(i,j)$ equal to $\tt1$ and all others equal to zero, while $(\star)$ is the tensor product.
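A minimal NumPy sketch of this component-wise assembly, again with illustrative shapes ($A$ square, $X\in\bbR{n\times 3}$), spot-checking one slice by finite differences:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4
A = rng.standard_normal((n, n))
X = rng.standard_normal((n, 3))

AX = A @ X
f = np.sqrt(np.diag(AX @ AX.T))         # f ⊙ f = diag(A X X^T A^T)

# ∂f/∂X_ij = Diag(f)^{-1} diag(A E_ij X^T A^T), stored as slice [:, i, j]
grad = np.zeros((n, n, 3))
for i in range(n):
    for j in range(3):
        E = np.zeros((n, 3))
        E[i, j] = 1.0                   # standard basis matrix E_ij
        grad[:, i, j] = np.diag(A @ E @ X.T @ A.T) / f

# spot-check one slice against central differences
eps = 1e-6
i, j = 1, 2
dX = np.zeros_like(X)
dX[i, j] = eps
fd = (np.linalg.norm(A @ (X + dX), axis=1)
      - np.linalg.norm(A @ (X - dX), axis=1)) / (2 * eps)
print(np.allclose(grad[:, i, j], fd))   # True
```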

Or, as Steph has done, one can use matrix-valued derivatives for the construction $$\eqalign{ \g{f_k}{X} &= {A^T\,\Diag{\e_k\oslash f}\;AX} \\ \g{f}{X} &= \sum_{k=1}^n \e_{k}\star\LR{\g{f_k}{X}} \;\in\bbR{n\times(n\times 3)} \\ }$$ where $\e_{k}\in\bbR{n}$ is a standard basis vector with component $k$ equal to $\tt1$ and zeros elsewhere, and $(\oslash)$ denotes Hadamard division.
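The same tensor can be assembled slice-by-slice this way. In the sketch below (same illustrative shapes as before), each slice $A^T\,\Diag{\e_k\oslash f}\,AX$ is also checked against the rank-one form $\frac{1}{f_k}\mathbf{a}_k^T(\mathbf{a}_k X)$ from Solution 1, since the diagonal matrix has a single nonzero entry:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4
A = rng.standard_normal((n, n))
X = rng.standard_normal((n, 3))
f = np.linalg.norm(A @ X, axis=1)

# slice k of the tensor: ∂f_k/∂X = A^T Diag(ε_k ⊘ f) A X, an n×3 matrix
grad = np.zeros((n, n, 3))
for k in range(n):
    D = np.zeros((n, n))
    D[k, k] = 1.0 / f[k]               # Diag(ε_k ⊘ f)
    grad[k] = A.T @ D @ A @ X
    # agrees with Solution 1: (1/f_k) a_k^T (a_k X)
    assert np.allclose(grad[k], np.outer(A[k], A[k] @ X) / f[k])
```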

In either case, note that you have a sum of $\,\big({\rm vector} \star {\rm matrix}\big)\,$ terms.


Perhaps this scalar-valued derivative provides the clearest answer: $$\eqalign{ \g{f_k}{X_{ij}} &= \e_i^TA^T\,\Diag{\e_k\oslash f}\;AXe_j \\ }$$ This result can be rearranged into many equivalent forms: $$\eqalign{ \g{f_k}{X_{ij}} &= E_{ij}:\BR{A^T\,\Diag{\e_k\oslash f}\;AX} \\ &= \LR{AE_{ij}X^TA^T}:\Diag{\e_k\oslash f} \\ &= \diag{AE_{ij}X^TA^T}:\bR{\e_k\oslash f} \\ &= \diag{AE_{ij}X^TA^T}:\Diag{f}^{-1}\e_k \\ &= \e_k^T\Diag{f}^{-1}\diag{AE_{ij}X^TA^T} \\ }$$ where $e_j$ is a standard basis vector for $\bbR 3$ and $(:)$ denotes the Frobenius inner product.
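A quick numerical check that the first and last of these forms agree (the shapes and the tested indices are again illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 4
A = rng.standard_normal((n, n))
X = rng.standard_normal((n, 3))
f = np.linalg.norm(A @ X, axis=1)

k, i, j = 1, 2, 0                      # arbitrary component to test
e_k, e_i = np.eye(n)[k], np.eye(n)[i]
e_j = np.eye(3)[j]

# ∂f_k/∂X_ij = ε_i^T A^T Diag(ε_k ⊘ f) A X e_j
lhs = e_i @ A.T @ np.diag(e_k / f) @ A @ X @ e_j

# last rearranged form: ε_k^T Diag(f)^{-1} diag(A E_ij X^T A^T)
E = np.outer(e_i, e_j)                 # standard basis matrix E_ij
rhs = (e_k / f) @ np.diag(A @ E @ X.T @ A.T)

print(np.isclose(lhs, rhs))           # True
```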