Calculate $\frac{∂L}{∂A}$ given $\frac{∂L}{∂G}$, $D=(A-\iota\cdot B^T)\odot\iota\cdot C^T$, and $G=D \odot(\iota\cdot E^T)+\iota\cdot F^T$

Solution 1:

$ \def\l{\lambda}\def\o{{\iota}}\def\p{\partial} \def\L{\left}\def\R{\right} \def\LR#1{\L(#1\R)} \def\vecc#1{\operatorname{vec}\LR{#1}} \def\diag#1{\operatorname{diag}\LR{#1}} \def\Diag#1{\operatorname{Diag}\LR{#1}} \def\trace#1{\operatorname{Tr}\LR{#1}} \def\qiq{\quad\implies\quad} \def\grad#1#2{\frac{\p #1}{\p #2}} \def\c#1{\color{red}{#1}} \def\gg{\LR{\grad{\l}{G}}} \def\ga{\LR{\grad{\l}{A}}} $Let's use a convention wherein an uppercase letter denotes a matrix, a lowercase letter a vector, and a Greek letter a scalar (so the loss $L$ is written $\l$ below). This means renaming the following problem variables $$\big\{B,C,E,F\big\}\to \big\{b,c,e,f\big\}$$ because we'll need to use those uppercase letters to denote diagonal matrices whose main diagonals are the lowercase letters, i.e. $$\eqalign{ B = \Diag{b},\quad C = \Diag{c},\quad E = \Diag{e},\quad I = \Diag{\o} = {\it Identity\;Matrix} }$$ where $\o$ is the all-ones vector.

Diagonal matrices can replace Hadamard products via the following rule $$\eqalign{ M\odot\LR{b\cdot c^T} &= B\cdot M\cdot C \\ }$$ When the left vector is $\o$, the left factor becomes $I$ and the rule reduces to $M\odot\LR{\o\cdot c^T} = M\cdot C$. Therefore $$\eqalign{ D &= {A\cdot C-\o\cdot b^T\cdot C} \\ G &= {D\cdot E+\o\cdot f^T} \\ }$$

Finally, let's use a colon to denote the Frobenius product $$\eqalign{ A:B &= \sum_{i=1}^m\sum_{j=1}^n A_{ij}B_{ij} \;=\; \trace{A\cdot B^T} \\ A:A &= \big\|A\big\|^2_F \\ }$$ This is also called the double-dot or double contraction product.
When applied to vectors $(n=\tt1)$ it reduces to an ordinary dot product.
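
As a sanity check, the rewriting above can be tested numerically. The following is only a minimal sketch using NumPy with random data; the sizes `m`, `n`, the helper vector `u`, and the matrices `M`, `P` are made-up names for illustration and are not part of the original problem.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 4, 3                                  # arbitrary sizes (assumption)
A = rng.standard_normal((m, n))
b, c, e, f = (rng.standard_normal(n) for _ in range(4))
iota = np.ones(m)                            # all-ones vector

# Original Hadamard form of D and G
D_had = (A - np.outer(iota, b)) * np.outer(iota, c)
G_had = D_had * np.outer(iota, e) + np.outer(iota, f)

# Diagonal-matrix form: D = A C - iota b^T C,  G = D E + iota f^T
C, E = np.diag(c), np.diag(e)
D_mat = A @ C - np.outer(iota, b) @ C
G_mat = D_mat @ E + np.outer(iota, f)
forms_agree = np.allclose(D_had, D_mat) and np.allclose(G_had, G_mat)

# General rule M ⊙ (u c^T) = Diag(u) M Diag(c), with a hypothetical left vector u
M = rng.standard_normal((m, n))
u = rng.standard_normal(m)
rule_holds = np.allclose(M * np.outer(u, c), np.diag(u) @ M @ C)

# Frobenius product: elementwise sum equals Tr(A P^T)
P = rng.standard_normal((m, n))
frobenius_matches_trace = np.isclose(np.sum(A * P), np.trace(A @ P.T))

print(forms_agree, rule_holds, frobenius_matches_trace)
```

All three flags should come out true if the rule and the rewritten forms of $D$ and $G$ are applied as stated.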

The properties of the underlying trace function allow the terms in such a product to be rearranged in many different but equivalent ways, e.g. $$\eqalign{ A:B &= B:A \\ A:B &= A^T:B^T \\ C:\LR{A\cdot B} &= \LR{C\cdot B^T}:A = \LR{A^T\cdot C}:B \\\\ }$$
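
These rearrangement rules are easy to spot-check on random matrices. The snippet below is a sketch under assumed shapes ($A$ is $4\times3$, $B$ is $3\times5$, $C$ is $4\times5$); the helper `frob` and the extra matrix `P` are hypothetical names introduced only for this check.

```python
import numpy as np

def frob(X, Y):
    """Frobenius product X:Y = sum over i,j of X_ij * Y_ij."""
    return float(np.sum(X * Y))

rng = np.random.default_rng(1)
A = rng.standard_normal((4, 3))
B = rng.standard_normal((3, 5))
C = rng.standard_normal((4, 5))
P = rng.standard_normal((4, 3))   # same shape as A, for the symmetric rules

checks = {
    "A:P = P:A": np.isclose(frob(A, P), frob(P, A)),
    "A:P = A^T:P^T": np.isclose(frob(A, P), frob(A.T, P.T)),
    "C:(AB) = (C B^T):A": np.isclose(frob(C, A @ B), frob(C @ B.T, A)),
    "C:(AB) = (A^T C):B": np.isclose(frob(C, A @ B), frob(A.T @ C, B)),
}

for name, ok in checks.items():
    print(name, bool(ok))
```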


Use the given gradient to write the differential of the function in terms of $G$, then change the independent variable from $G\to D\to A$, then recover the gradient with respect to $A$. With $b,c,e,f$ held constant, $dG = dD\cdot E$ and $dD = dA\cdot C$; the transposes produced by the rearrangement rules drop out because $E$ and $C$ are diagonal, hence symmetric. $$\eqalign{ d\l &= \gg:dG \\ &= \gg:\LR{dD\cdot E} \\ &= \LR{\gg\cdot E}:{dD} \\ &= \LR{\gg\cdot E}:\LR{dA\cdot C} \\ &= \LR{\gg\cdot E\cdot C}:{dA} \\ \ga &= \gg\cdot E\cdot C \;\;\doteq\; \gg\odot\LR{\o\cdot e^T}\odot\LR{\o\cdot c^T} \\ }$$ The other gradients can be calculated in a similar fashion.
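
The result can be compared against finite differences. This is a rough sketch, not a definitive test: the loss $\l=\tfrac12\|G\|_F^2$ (whose gradient with respect to $G$ is $G$ itself) is an assumption chosen purely for illustration, as are the function names `forward` and `loss`, the sizes, and the step `h`.

```python
import numpy as np

def forward(A, b, c, e, f):
    """G = ((A - iota b^T) ⊙ iota c^T) ⊙ iota e^T + iota f^T.

    Broadcasting a length-n vector over the rows of A plays the
    role of the outer products with the all-ones vector iota.
    """
    D = (A - b) * c
    return D * e + f

def loss(G):
    """Hypothetical scalar loss, 0.5 * ||G||_F^2, so dL/dG = G."""
    return 0.5 * np.sum(G ** 2)

rng = np.random.default_rng(2)
m, n = 4, 3
A = rng.standard_normal((m, n))
b, c, e, f = (rng.standard_normal(n) for _ in range(4))

G = forward(A, b, c, e, f)
dL_dG = G                         # gradient of the assumed loss
dL_dA = dL_dG * e * c             # same as dL_dG @ Diag(e) @ Diag(c)

# Central finite differences, one entry of A at a time
h = 1e-6
numeric = np.zeros_like(A)
for i in range(m):
    for j in range(n):
        step = np.zeros_like(A)
        step[i, j] = h
        plus = loss(forward(A + step, b, c, e, f))
        minus = loss(forward(A - step, b, c, e, f))
        numeric[i, j] = (plus - minus) / (2 * h)

max_abs_error = float(np.max(np.abs(numeric - dL_dA)))
print(max_abs_error)
```

The printed error should be tiny compared with the gradient entries; a large value would point to a mistake in the derivation or in the loss gradient that was supplied.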