Fit plane to a set of points in 3D: scipy.optimize.minimize vs scipy.linalg.lstsq

Least squares (scipy.linalg.lstsq) is guaranteed to converge. In fact, there is a closed-form analytical solution, given by x = (A^T A)^-1 A^T b (where ^T is the matrix transpose and ^-1 is matrix inversion).
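
For concreteness, here is a minimal sketch of the lstsq approach. The x, y, z arrays below are made-up sample data, not from the original question:

import numpy as np
from scipy.linalg import lstsq

# hypothetical data: noisy samples of the plane z = 2*x - 3*y + 5
rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, 100)
y = rng.uniform(-1, 1, 100)
z = 2 * x - 3 * y + 5 + rng.normal(scale=0.01, size=100)

# each row of A is (x_i, y_i, 1), so A @ [a, b, c] approximates z
A = np.column_stack([x, y, np.ones_like(x)])
(a, b, c), residues, rank, sv = lstsq(A, z)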

A general optimization problem, however, is not always solvable -- we are not guaranteed to find a minimizing value. But for the given problem -- find a, b, c such that z = a*x + b*y + c -- the model is linear in the variables we are optimizing over, so the sum-of-squares objective is convex. Problems of this kind are reliably solvable, so scipy.optimize.minimize should converge to the optimal value.
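
As a rough sketch of the minimize route, reusing the hypothetical x, y, z arrays from above (the objective function sq_residuals is my own construction, a plain sum of squared residuals, not anything provided by scipy):

from scipy.optimize import minimize

# sum of squared residuals of the candidate plane z = a*x + b*y + c
def sq_residuals(params, x, y, z):
    a, b, c = params
    return np.sum((a * x + b * y + c - z) ** 2)

result = minimize(sq_residuals, x0=np.zeros(3), args=(x, y, z))
a, b, c = result.x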

Note: the problem stays linear in the unknowns even if we fit z = a*x + b*y + d*x^2 + e*y^2 + f -- we don't have to restrict ourselves to linear functions of (x, y), because we can precompute the values (x, y, x^2, y^2) for every point. To the algorithm, these are just columns of the matrix A. So we can actually fit a higher-order polynomial using least squares, as in the sketch below.
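
For instance, extending the hypothetical example above to a quadratic surface (the variable names are illustrative):

# quadratic surface z = a*x + b*y + d*x**2 + e*y**2 + f: still linear
# in the coefficients, so the same lstsq call applies unchanged
A_quad = np.column_stack([x, y, x**2, y**2, np.ones_like(x)])
(a, b, d, e, f), residues, rank, sv = lstsq(A_quad, z)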

A brief aside: in reality, solvers that don't use an exact analytical solution stop once they are within some acceptable tolerance of the true answer, so it is rarely the case that we get the exact solution -- but it tends to be so close that we accept it as exact in practice. Additionally, even least-squares solvers rarely evaluate the analytical formula (A^T A)^-1 A^T b directly; they use faster and more numerically stable factorizations such as QR or the SVD instead (scipy.linalg.lstsq, for example, calls an SVD-based LAPACK routine by default).

If you were to change the optimization problem, this would not be true. There are certain classes of problems for which we can generally find an optimal value -- the largest of these is the class of convex optimization problems, although there are many non-convex problems for which we can find an optimal value under certain conditions.

If you're interested in learning more, take a look at Convex Optimization by Boyd and Vandenberghe. The first chapter doesn't require much mathematical background, and it gives an overview of the general optimization problem and how it relates to solvable classes such as linear and convex programming.


I would like to complete the answer with an alternative method for finding the best plane that fits a set of points in R^3. The lstsq approach works pretty well except in degenerate cases where, for example, the x coordinates of all the points are 0 (or all equal). In such a case, the columns of the A matrix used in lstsq are not linearly independent. For example:

A = [[ 0   y_0    1]
     [ 0   y_1    1]
     ...
     [ 0   y_k    1] 
     ...
     [ 0   y_N    1]]
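
A quick, self-contained illustration of this failure mode (the data below is made up):

import numpy as np
from scipy.linalg import lstsq

# all x coordinates are 0, so the first column of A is identically zero
y_pts = np.linspace(0.0, 1.0, 10)
A = np.column_stack([np.zeros_like(y_pts), y_pts, np.ones_like(y_pts)])
z_pts = 4 * y_pts + 1

sol, residues, rank, sv = lstsq(A, z_pts)
print(rank)  # 2 rather than 3: the coefficient of x is not determined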

To circumvent this problem, you can apply SVD directly to the centered coordinates of the set of points. (An SVD is in fact used inside lstsq as well, just on a different matrix.)

Here is a Python example, given the coordinates of the points in the coords array:

import numpy as np

# barycenter of the points
G = coords.mean(axis=0)

# run SVD on the centered coordinates
u, s, vh = np.linalg.svd(coords - G)

# unit vector normal to the best-fit plane (last right singular vector)
u_norm = vh[2, :]

Using this approach, vh is a 3x3 matrix whose rows are orthonormal vectors. The first two rows form an orthonormal basis of the plane; the third is a unit vector normal to the plane.

If you really need the a, b, c parameters, you can get them from the normal vector, because its coordinates are (a, b, c), assuming the equation of the plane is a*x + b*y + c*z + d = 0. The remaining parameter d can then be recovered from the fact that the plane passes through the barycenter G.
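
Continuing the example above:

# the plane passes through the barycenter G, so d = -(normal . G)
a, b, c = u_norm
d = -np.dot(u_norm, G)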