What is the most efficient algorithm to find a straight line that goes through most points?

The problem:

N points are given on a 2-dimensional plane. What is the maximum number of points on the same straight line?

The problem has an O(N^2) solution: for each point, count how many other points have the same dx / dy ratio relative to it. Storing the dx / dy ratios in a hash map keeps the work per point linear.
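
For reference, a minimal sketch of the quadratic approach described above. It assumes integer coordinates so that each (dx, dy) can be reduced exactly with a gcd instead of dividing, and it treats repeated points as separate points. The function name is only illustrative.

```python
from collections import Counter
from math import gcd


def max_points_on_line(points):
    """Return the largest number of the given integer points on one line."""
    n = len(points)
    if n <= 2:
        return n
    best = 0
    for i, (x1, y1) in enumerate(points):
        directions = Counter()  # reduced (dx, dy) -> number of points seen
        duplicates = 0          # other points identical to the current one
        for j, (x2, y2) in enumerate(points):
            if i == j:
                continue
            dx, dy = x2 - x1, y2 - y1
            if dx == 0 and dy == 0:
                duplicates += 1
                continue
            g = gcd(dx, dy)
            dx, dy = dx // g, dy // g
            # Fix the sign so opposite directions share one key.
            if dx < 0 or (dx == 0 and dy < 0):
                dx, dy = -dx, -dy
            directions[(dx, dy)] += 1
        most = max(directions.values(), default=0)
        best = max(best, most + duplicates + 1)
    return best
```

Using the reduced (dx, dy) pair as the key avoids both floating-point slope comparisons and the division by zero that a vertical line would cause.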

Is there a better solution to this problem than O(N^2)?


Solution 1:

There is likely no solution to this problem that is significantly better than O(n^2) in a standard model of computation.

The problem of finding three collinear points reduces to the problem of finding the line that goes through the most points, and finding three collinear points is 3SUM-hard, meaning that solving it in O(n^(2 - eps)) time for some constant eps > 0 would be a major theoretical result.

See the previous question on finding three collinear points.

For your reference (using the known proof), suppose we want to answer a 3SUM problem such as finding x, y, z in list X such that x + y + z = 0. If we had a fast algorithm for the collinear point problem, we could use that algorithm to solve the 3SUM problem as follows.

For each x in X, create the point (x, x^3) (for now we assume the elements of X are distinct). Next, check whether there exist three collinear points among the created points.

To see that this works, note that if x + y + z = 0 then the slope of the line from (x, x^3) to (y, y^3) is

(y^3 - x^3) / (y - x) = y^2 + yx + x^2

and the slope of the line from (x, x^3) to (z, z^3) is

(z^3 - x^3) / (z - x) = z^2 + zx + x^2 = (-(x + y))^2 - (x + y)x + x^2 = x^2 + 2xy + y^2 - x^2 - xy + x^2 = y^2 + yx + x^2

Conversely, if the slope from x to y equals the slope from x to z then

y^2 + yx + x^2 = z^2 + zx + x^2,

which, after cancelling x^2 and factoring y^2 - z^2 + x(y - z), implies that

(y - z) (x + y + z) = 0,

so either y = z or z = -x - y. Since the elements are distinct, y = z is ruled out, so x + y + z = 0, which proves that the reduction is valid.

If there are duplicates in X, you first check whether x + 2y = 0 for any x and duplicate element y (in linear time using hashing or O(n lg n) time using sorting), and then remove the duplicates before reducing to the collinear point-finding problem.
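
A small sketch of the reduction, assuming integer inputs so that the arithmetic is exact. The names `three_sum_via_collinearity` and `has_three_collinear` are made up for illustration, and the brute-force oracle is only a stand-in for whatever fast collinearity algorithm is hypothesized. The extra check for a value that appears three times (which can only be 0) is my own addition to the duplicate handling.

```python
from collections import Counter
from itertools import combinations


def has_three_collinear(points):
    """Stand-in oracle: brute-force O(n^3) test for three collinear points."""
    for (x1, y1), (x2, y2), (x3, y3) in combinations(points, 3):
        # A zero cross product means the three points are collinear.
        if (x2 - x1) * (y3 - y1) == (x3 - x1) * (y2 - y1):
            return True
    return False


def three_sum_via_collinearity(values, oracle=has_three_collinear):
    """Decide whether three entries at distinct positions sum to zero."""
    counts = Counter(values)
    # Duplicate handling: triples that use some value y more than once.
    for y, c in counts.items():
        if c >= 2:
            x = -2 * y
            if x != y and x in counts:
                return True
            if x == y and c >= 3:  # only happens for y == 0 (assumed extra case)
                return True
    # Remaining case: three distinct values, lifted onto the curve y = x^3.
    points = [(x, x ** 3) for x in counts]
    return oracle(points)
```

For example, `three_sum_via_collinearity([1, 2, -3, 7])` finds the collinear points (1, 1), (2, 8) and (-3, -27), which correspond to 1 + 2 - 3 = 0.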

Solution 2:

If you limit the problem to lines passing through the origin, you can convert the points to polar coordinates (angle, distance from origin) and sort them by angle. All points with the same angle, taken modulo 180 degrees so that opposite directions coincide, lie on the same line through the origin. This takes O(n log n) time.
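
A rough sketch of the origin-only case, assuming integer coordinates. Instead of a floating-point angle it sorts an exact reduced direction vector, which groups the same points together without rounding issues; that substitution is mine, not part of the answer above. Points located at the origin itself are counted on every line.

```python
from math import gcd


def max_points_on_line_through_origin(points):
    """Largest number of integer points on a single line through (0, 0)."""
    at_origin = 0
    directions = []
    for x, y in points:
        if x == 0 and y == 0:
            at_origin += 1  # the origin lies on every such line
            continue
        g = gcd(x, y)
        dx, dy = x // g, y // g
        # Opposite rays belong to the same line, so fix the sign.
        if dx < 0 or (dx == 0 and dy < 0):
            dx, dy = -dx, -dy
        directions.append((dx, dy))
    directions.sort()  # O(n log n); equal directions become adjacent
    best = run = 0
    for k, d in enumerate(directions):
        run = run + 1 if k > 0 and d == directions[k - 1] else 1
        best = max(best, run)
    return best + at_origin
```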

I don't think there is a faster solution in the general case.

Solution 3:

The Hough Transform can give you an approximate solution. It is approximate because the binning technique has a limited resolution in parameter space, so the maximum bin will give you some limited range of possible lines.
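
To make the binning concrete, here is a bare-bones sketch of a Hough accumulator in plain Python. The parameters `n_theta` and `rho_res` (number of angle bins and width of a distance bin) are names I picked for illustration, and the line is parameterized in the usual normal form rho = x*cos(theta) + y*sin(theta).

```python
import math
from collections import Counter


def hough_best_line(points, n_theta=180, rho_res=1.0):
    """Return (votes, theta, rho) for the fullest accumulator bin."""
    accumulator = Counter()
    for x, y in points:
        for k in range(n_theta):
            theta = k * math.pi / n_theta
            # Every point votes for each (theta, rho) line passing near it.
            rho = x * math.cos(theta) + y * math.sin(theta)
            accumulator[(k, round(rho / rho_res))] += 1
    if not accumulator:
        return 0, None, None
    (k, r), votes = accumulator.most_common(1)[0]
    return votes, k * math.pi / n_theta, r * rho_res
```

The running time is O(n * n_theta). The returned theta and rho only identify a bin, so points that are merely close to a common line can vote for it too, which is exactly the approximation mentioned above.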

Solution 4:

Here is another O(n^2) solution, with pseudocode. The idea is to build a hash table keyed by the line itself. A line is identified by its slope and by the point where it crosses the x-axis or, for a line parallel to the x-axis, the point where it crosses the y-axis.

The solution assumes a language like Java or C#, where the object's equals and hashcode methods are used by the hash map.

Create an object (call it SlopeObject) with 3 fields:

  1. Slope // Can be Infinity
  2. Point of intercept with x-axis -- poix // Will be (Infinity, some y value) or (x value, 0)
  3. Count

poix is an (x, y) point. If the line crosses the x-axis, poix is (some number, 0). If the line is parallel to the x-axis, poix = (Infinity, some number), where the y value is where the line crosses the y-axis. Override the equals method so that 2 objects are equal if their Slope and poix are equal.

Hashcode is overridden with a function that computes the hash code from the combined values of Slope and poix. Some pseudocode is below.

Hashmap map;
foreach(point a in the array) {
    foreach(every other point b) {
        slope = calculateSlope(a, b);
        poix = calculateXInterception(a, b);
        SlopeObject so = new SlopeObject(slope, poix, 1); // Slope, poix and initial count 1.
        SlopeObject inMapSlopeObj = map.get(so);
        if(inMapSlopeObj == null) {
            map.put(so, so); // First time this line is seen: store it under its own key.
        } else {
            inMapSlopeObj.setCount(inMapSlopeObj.getCount() + 1);
        }
    }
}
// Count tallies point pairs, not points: a line with k points is hit k * (k - 1) times.
SlopeObject maxCounted = getObjectWithMaxCount(map);
print("line is through " + maxCounted.poix + " with slope " + maxCounted.slope);
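
For comparison, a runnable sketch of the same line-as-key idea in Python, assuming integer coordinates. It swaps the (slope, poix) key for a normalized (a, b, c) triple with a*x + b*y = c, which sidesteps the Infinity special cases, and it stores the set of point indices per line so that the result is a point count rather than a pair count. `line_key` and `busiest_line` are hypothetical names.

```python
from collections import defaultdict
from itertools import combinations
from math import gcd


def line_key(p, q):
    """Canonical (a, b, c) with a*x + b*y = c for the line through p and q."""
    (x1, y1), (x2, y2) = p, q
    a, b = y2 - y1, x1 - x2
    c = a * x1 + b * y1
    g = gcd(gcd(a, b), c)  # nonzero because p != q
    a, b, c = a // g, b // g, c // g
    # Fix the overall sign so each line has exactly one key.
    if a < 0 or (a == 0 and b < 0):
        a, b, c = -a, -b, -c
    return a, b, c


def busiest_line(points):
    """Return (key, count) for the line through the most points."""
    on_line = defaultdict(set)  # line key -> indices of points on that line
    for (i, p), (j, q) in combinations(enumerate(points), 2):
        if p == q:
            continue  # identical points do not define a line
        on_line[line_key(p, q)].update((i, j))
    if not on_line:
        return None, 0  # fewer than two distinct points
    key, members = max(on_line.items(), key=lambda kv: len(kv[1]))
    return key, len(members)
```

For the points (0, 0), (1, 1), (2, 2), (0, 1), (5, 5) this returns the key (1, -1, 0), i.e. the line x - y = 0, with a count of 4. Time and memory are both O(n^2) in the worst case.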