LCP with sparse matrix
I indicate matrices by capital letters, and vectors by small letters.
I need to solve the following system of linear inequalities for vector v
:
min(rv - (u + Av), v - s) = 0
where 0
is a vector of zeros.
where r
is a scalar, u
and s
are vectors, and A
is a matrix.
Defining z = v-s
, B=rI - A
, q=-u + Bs
, I can rewrite the previous problem as a linear complementarity problem and hope to use an LCP solver, for example from openopt
:
LCP(M, z): min(Bz+q, z) = 0
or, in matrix notation:
z'(Bz+q) = 0
z >= 0
Bz + q >= 0
The problem is that my system of equations is huge. To create A
, I
- Create four matrices
A11
,A12
,A21
,A22
usingscipy.sparse.diags
- And stack them together as
A = scipy.sparse.bmat([[A11, A12], [A21, A22]])
- (This also means that
A
is not symmetric, and hence some efficient translations intoQP
problems won't work)
openopt.LCP
apparently cannot deal with sparse matrices: When I ran this, my computer crashed. Generally, A.todense()
will lead to a memory error. Similarly, compecon-python
is not able to solve LCP
problems with sparse matrices.
What alternative LCP
implementations are fit for this problem?
I really did not think sample data was required for a general "which tools to solve LCP" question were required, but anyways, here we go
from numpy.random import rand
from scipy import sparse
n = 3000
r = 0.03
A = sparse.diags([-rand(n)], [0])
s = rand(n,).reshape((-1, 1))
u = rand(n,).reshape((-1, 1))
B = sparse.eye(n)*r - A
q = -u + B.dot(s)
q.shape
Out[37]: (3000, 1)
B
Out[38]:
<3000x3000 sparse matrix of type '<class 'numpy.float64'>'
with 3000 stored elements in Compressed Sparse Row format>
Some more pointers:
-
openopt.LCP
crashes with my matrices, I assume it converts the matrices to dense before continuing -
compecon-python
outright fails with some error, it apparently requires dense matrices and has no fallback for sparsity -
B
is not positive semi-definite, so I cannot rephrase the linear complementarity problem (LCP) as a convex quadratic problem (QP) - All QP sparse solvers from this exposition require the problem to be convex, which mine is not
- In Julia, PATHSolver can solve my problem (given a license). However, there are problems calling it from Python with PyJulia (my issue report here)
- Also Matlab has an LCP solver that apparently can handle sparse matrices, but there implementation is even more wacky (and I really do not want to fallback on Matlab for this)
Solution 1:
This problem has a very efficient (linear time) solution, though it requires a bit of discussion...
Zeroth: clarifying the problem / LCP
Per clarifications in the comments, @FooBar says the original problem is elementwise min
; we need to find a z
(or v
) such that
either the left argument is zero and the right argument is non-negative or the left argument is non-negative and the right argument is zero
As @FooBar correctly points out, a valid reparameterization leads to an LCP. However, below I show that an easier and more efficient solution to the original problem can be achieved (by exploiting structure in the original problem) without needing the LCP. Why should this be easier? Well, notice that an LCP has a quadratic term in z
(Bz+q)'z, but that the original problem doesn't (only linear terms Bz+q and z). I'll exploit that fact below.
First: simplify
There is an important but key detail that simplifies this problem in a major way:
- Create four matrices A11, A12, A21, A22 using scipy.sparse.diags
- And stack them together as A = scipy.sparse.bmat([[A11, A12], [A21, A22]])
This has huge implications. Specifically, this is not a single large problem, but rather a large number of really small (2D, to be precise) problems. Notice that the block diagonal structure of this A
matrix is preserved throughout all subsequent operations. And every subsequent operation is a matrix-vector multiply or an inner product. This means that really this program is separable in pairs of z
(or v
) variables.
To be specific, suppose each block A11,...
is size n
by n
. Then notice critically that z_1
and z_{n+1}
appear only in equations and terms with themselves, and never elsewhere. So the problem is separable into n
problems, each of which is 2 dimensional. Thus, I will hereafter solve the 2D problem, and you can serialize or parallelize over n
as you see fit, without sparse matrices or big opt packages.
Second: the geometry of the 2D problem
Assume we have the elementwise problem in 2D, namely:
find z such that (elementwise) min( Bz + q , z ) = 0, or declare that no such `z` exists.
Because in our setup B
is now 2x2, this problem geometry corresponds to four scalar inequalities that we can manually enumerate (I’ve named them a1,a2,z1,z2):
“a1”: B11*z1 + B12*z2 + q1 >=0
“a2”: B21*z1 + B22*z2 + q2 >=0
“z1”: z1 >= 0
“z2:” z2 >= 0
This represents a (possibly empty) polyhedra, aka the intersection of four half spaces in 2d space.
Third: solving the 2D problem
(Edit: updated this bit for clarity)
What specifically is the 2D problem then? We want to find a z
where one of the following solutions (which are not all distinct, but that won’t matter) is achieved:
- a1>=0, z1=0, a2>=0, z2=0
- a1=0, z1>=0, a2=0, z2>=0
- a1>=0, z1=0, a2=0, z2>=0
- a1=0, z1>=0, a2>=0, z2=0
If one of these is achieved, we have a solution: a z where the elementwise minimum of z and Bz+q is the 0 vector. If none of the four can be achieved, the problem is infeasible and we will declare that no such z
exists.
The first case occurs when the intersection point of a1,a2 is in the positive orthant. Just choose the solution z = B^-1q, and check if it is elementwise nonnegative. If it is, return B^-1q
(note that, even though B is not psd, I am assuming it is invertible/full rank). So:
if B^-1q is elementwise nonnegative, return z = B^-1q.
The second case (not entirely distinct from the first) occurs when the intersection point of a1,a2 is anywhere but does contain the origin. In otherwords, whenever Bz+q >0 for z = 0. This occurs when q is elementwise positive. So:
elif q is elementwise nonnegative, return z = 0.
The third case has z1=0, which can be substituted into a2 to show that a2=0 when z2 = -q2/B22. z2 must be >=0, so -q2/B22 >=0. a1 must also be >=0, which substituting in the values for z1 and z2, gives -B22/B12*q2 + q1 >=0. So:
elif -q2/B22 >=0 and -B22/B12*q2 + q1 >=0, return z1= 0, z2 = -q2/B22.
The fourth step is the same as the third, but swapping 1 and 2. So:
elif -q1/B11 >=0 and -B21/B11*q1 + q2 >=0, return z1 = -q1/B11, z2 =0.
The final fifth case is when the problem is infeasible. That occurs when none of the above conditions are met. So:
else return None
Finally: put the pieces together
Solving each 2D problem is a couple of simple/efficient/trivial linear solves and compares. Each will return a pair of numbers or None
. Then just do same over all n
2D problems, and concatenate the vector. If any are None, the problem is infeasible (all None). Else, you have your answer.