Let $x_0 \in \mathbb{R}^n$ and $A \in \mathbb{R}^{n \times n}$ be given. Define the set $S$ as $$ S \triangleq \{x \in \mathbb{R}^n: A x \leq 1\}, $$ where $1$ denotes the all-ones vector and the inequality is componentwise.

I want to compute the projection of $x_0$ onto $S$, i.e. the closest point to $x_0$ in $S$. Is there an efficient way to do this? Of course, this can be formulated as a quadratic program, but if I want to solve a larger problem with projected gradient descent, I need this projection step as a subroutine. As of now I am trying to derive a closed-form expression, but that seems futile.


Solution 1:

As mentioned in other answers and comments, the problem you need to solve is an inequality-constrained, strongly convex QP. After the change of variables $x \leftarrow x - x_0$ (and setting $b \triangleq 1 - Ax_0$), it reads:

$$ \begin{aligned} \mathrm{minimize}\ &\tfrac{1}{2}\|x\|^2_2 \\ \mathrm{subject\ to}\ & Ax \leq b. \end{aligned} $$

There's little chance a direct (closed-form) method applies here outside of special cases, so you should resort to iterative methods (and if you need this operation inside an outer iterative routine, you will obtain a doubly-iterative algorithm).

There are many iterative methods that you can use to solve this type of problem. Many of them rely on a projection step; note, however, that the projection they require is onto the nonnegative orthant (which can be done in closed form), not onto the polyhedron itself (which is, of course, the ultimate goal).

The fast proximal gradient method is arguably the simplest option. In this case you would apply this type of algorithm to the dual of your original problem, which reads

$$ \begin{aligned} \mathrm{minimize}\ &\tfrac{1}{2}\|A^\top y\|^2_2 + \langle b,y \rangle \\ \mathrm{subject\ to}\ & y \geq 0 \end{aligned} $$

(check that all signs in the above problem are correct: minimizing the Lagrangian $\tfrac{1}{2}\|x\|^2_2 + \langle y, Ax - b \rangle$ over $x$ gives $x = -A^\top y$, and substituting back yields the dual above). In fact, since the prox of the indicator of the nonnegative orthant is just the projection $y \mapsto \max(y, 0)$, the (fast) dual PGM in this case is precisely the (fast) dual projected gradient method. This is the simplest method I can see: only matrix-vector products with $A$ and $A^\top$ are needed, apart from trivial projections onto the nonnegative orthant and vector sums. Once a dual solution $y^\star$ is found, the projection is recovered as $x^\star = x_0 - A^\top y^\star$ (undoing the change of variables).
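To make this concrete, here is a minimal NumPy sketch of the fast dual projected gradient method just described (the function name, step size $1/\|A\|_2^2$, and stopping rule are my own choices, and a dense $A$ is assumed):

```python
import numpy as np

def project_fast_dual_pg(A, x0, max_iter=5000, tol=1e-8):
    """Hypothetical sketch: project x0 onto {x : A x <= 1} by running
    FISTA (fast proximal gradient) on the dual problem above."""
    m, n = A.shape
    b = 1.0 - A @ x0                           # dual linear term after shifting by x0
    L = max(np.linalg.norm(A, 2) ** 2, 1e-12)  # Lipschitz constant of the dual gradient
    y = np.zeros(m)                            # dual iterate
    w = y.copy()                               # extrapolated point
    t = 1.0
    for _ in range(max_iter):
        grad = A @ (A.T @ w) + b               # gradient of (1/2)||A^T y||^2 + <b, y> at w
        y_new = np.maximum(w - grad / L, 0.0)  # projected gradient step onto y >= 0
        t_new = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        w = y_new + ((t - 1.0) / t_new) * (y_new - y)
        if np.linalg.norm(y_new - y) <= tol * max(1.0, np.linalg.norm(y)):
            y = y_new
            break
        y, t = y_new, t_new
    return x0 - A.T @ y                        # recover the primal (projected) point
```

Each iteration costs one multiplication by $A$ and one by $A^\top$, plus the closed-form projection $\max(\cdot, 0)$, matching the description above.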

Alternatively, you can apply the Douglas-Rachford splitting algorithm to the dual, which amounts to the alternating direction method of multipliers (ADMM) on the primal. In this case you will typically need to solve linear systems along the iterations, either with a direct method or with some iterative procedure (such as a Krylov subspace method), depending on the size/structure of $A$.
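Along the same lines, a hypothetical ADMM sketch for this projection, splitting $Ax = z$ and caching a Cholesky factorization for the repeated linear solves (the parameter $\rho$ and the stopping test are my own choices):

```python
import numpy as np

def project_admm(A, x0, rho=1.0, max_iter=2000, tol=1e-8):
    """Hypothetical ADMM sketch for projecting x0 onto {x : A x <= 1},
    using the splitting  min (1/2)||x - x0||^2 + indicator(z <= 1)
    subject to A x = z."""
    m, n = A.shape
    # The x-update solves (I + rho A^T A) x = x0 + rho A^T (z - u);
    # the system matrix is fixed, so factor it once.
    L = np.linalg.cholesky(np.eye(n) + rho * (A.T @ A))
    z = np.minimum(A @ x0, 1.0)
    u = np.zeros(m)
    for _ in range(max_iter):
        rhs = x0 + rho * (A.T @ (z - u))
        x = np.linalg.solve(L.T, np.linalg.solve(L, rhs))  # two triangular solves
        Ax = A @ x
        z_new = np.minimum(Ax + u, 1.0)   # projection onto {z : z <= 1}
        u += Ax - z_new                   # scaled dual update
        # simplified stopping test on primal change and residual
        if max(np.linalg.norm(Ax - z_new), np.linalg.norm(z_new - z)) <= tol:
            z = z_new
            break
        z = z_new
    return x
```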

Furthermore, active-set and interior-point methods are also options, depending on the size of the problems you're handling (see also the top answer to a similar question on this site).

There's really no ultimate answer: which approach is the best fit always depends on the particular structure of the problem and on the accuracy of the solution that you need. However, the methods I suggested above are general-purpose and make reasonable starting points.

Solution 2:

This is a quadratic programming problem: minimize $(x - x_0)^T (x - x_0)$ subject to $A x \le 1$. The quadratic form is positive definite.

EDIT: You might also note that any quadratic programming problem whose quadratic form is positive definite is equivalent to your problem by a change of variables (diagonalizing the quadratic form). So there's really nothing special here.
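For concreteness, here is one way to carry out that reduction (sketched with a Cholesky factor $Q = R^\top R$ rather than a full diagonalization; the algebra below is an illustration, not part of the original claim). Given a QP $\min_x \tfrac{1}{2} x^\top Q x + c^\top x$ subject to $Ax \le b$ with $Q \succ 0$, the change of variables $y = Rx$ gives

$$ \tfrac{1}{2} x^\top Q x + c^\top x = \tfrac{1}{2}\|y\|^2_2 + c^\top R^{-1} y = \tfrac{1}{2}\|y - y_0\|^2_2 + \mathrm{const}, \qquad y_0 \triangleq -R^{-\top} c, $$

so solving the QP is exactly projecting $y_0$ onto the polyhedron $\{y : A R^{-1} y \le b\}$.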

Efficient algorithms for solving quadratic programming problems with positive definite quadratic forms are well known: see, e.g., the Wikipedia article on quadratic programming. Software such as Maple and CPLEX has implementations of such algorithms.
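As a concrete illustration, here is how the projection could be posed with the open-source CVXPY modeling package in Python (a minimal sketch with made-up random data; CVXPY hands the resulting QP to a solver such as OSQP):

```python
import numpy as np
import cvxpy as cp

# Made-up problem data: project x0 onto {x : A x <= 1}.
rng = np.random.default_rng(0)
n = 5
A = rng.standard_normal((n, n))
x0 = rng.standard_normal(n)

x = cp.Variable(n)
objective = cp.Minimize(cp.sum_squares(x - x0))  # (x - x0)^T (x - x0)
constraints = [A @ x <= 1]                       # componentwise inequality
prob = cp.Problem(objective, constraints)
prob.solve()                                     # delegates to a QP-capable solver

print("projection of x0 onto S:", x.value)
```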