Proximal Operator / Proximal Mapping for Composition of Functions
Suppose I have a convex function $f(x)$ for which I can easily compute the proximal mapping
prox$_f(z) = \arg\min_{x} f(x) + \frac{1}{2}||x-z||^2_2$
is there a simple expression for the proximal mapping of $f(\mathcal{A}(x))$ where $\mathcal{A}(x)$ is linear? That is, can I express
$\arg\min_{x} f(\mathcal{A}(x)) + \frac{1}{2}||x-z||^2_2$
in terms of prox$_f(z)$?
Solution 1:
You might want to check the paper 'Signal recovery by proximal forward-backward splitting' by Combettes and Wajs (*) which reviews the prox and the prox of typical operations on functions (translation, scaling, …).
In that paper, you will find that if $\mathbf A$ is a semi-orthogonal linear transform (i.e.: $\mathbf A\mathbf A^* = \nu \mathbf{Id}$ with $\nu>0$) then there is a closed form of the prox of $f\circ \mathbf A$. I don't think there is one for the general case (I'd be tempted to say that it would be in that paper if it existed!)
The expression is the following:
$$ \mathrm{prox}_{f\circ \mathbf A}(x) = x + \nu^{-1} \mathbf{A}^*\left(\mathrm{prox}_{\nu f}(\mathbf A x) - \mathbf A x\right) \tag{1}$$
Hope this helps a little!
(*) Paper by Combettes and Wajs.
EDIT if you're interested in proving the above expression, here are the main steps of a possible proof:
- Express the definition of the problem in a dual form (Fenchel-Rockafellar) and write the first order optimality conditions
- Use Moreau's decomposition you should get something like $$ \nu^{-1}\mathbf A \mathbf d - y^* = \mathrm{prox}_{(\nu^{-1}f^\star)^\star}(\nu^{-1}\mathbf A\mathbf d)$$ where $y^*$ is the dual variable, and $^\star$ denotes the convex conjugate.
- express the convex conjugate of $(\nu^{-1}f^\star)$
- Shake the whole a little and using the optimality conditions, it should boil down to (1).
Another proof is using the fact that $\partial (f\circ \mathbf A) = \mathbf A^*\circ\partial f\circ \mathbf A$ (equality holds since $\mathbf A$ is surjective and hence the strong relint qualification condition holds, cf. Rockafellar's book). With Moreau decomposition $$x - \mathrm{prox}_{f\circ \mathbf A}(x) \in \mathbf A^*\partial f(\mathbf Ax)\in Im(\mathbf A^*)$$ multiply everything by $\mathbf A$ and it boils down to $$\mathbf A \mathrm{prox}_{f\circ\mathbf A}(x) = \mathrm{prox}_{\nu f}(\mathbf A x)$$ you can then use the decomposition of the prox in its orthogonal projection on $Im(\mathbf A^*)$ and on $ker(\mathbf A)$ and using the last equation you should end up with (1).