Getting the submatrix with maximum sum?

Here's an explanation to go with the posted code. There are two key tricks to make this work efficiently: (I) Kadane's algorithm and (II) using prefix sums. You also need to (III) apply the tricks to the matrix.

Part I: Kadane's algorithm

Kadane's algorithm is a way to find a contiguous subsequence with maximum sum. Let's start with a brute force approach for finding the max contiguous subsequence and then consider optimizing it to get Kadane's algorithm.

Suppose you have the sequence:

-1,  2,  3, -2

For the brute force approach, walk along the sequence generating all possible subsequences as shown below. Considering all possibilities, we can start, extend, or end a list with each step.

At index 0, we consider appending the -1
-1,  2,  3, -2
 ^
Possible subsequences:
-1   [sum -1]

At index 1, we consider appending the 2
-1,  2,  3, -2
     ^
Possible subsequences:
-1 (end)      [sum -1]
-1,  2        [sum  1]
 2            [sum  2]

At index 2, we consider appending the 3
-1,  2,  3, -2
         ^
Possible subsequences:
-1, (end)       [sum -1]
-1,  2 (end)    [sum 1]
 2 (end)        [sum 2]
-1,  2,  3      [sum 4]
 2,  3          [sum 5]
 3              [sum 3]

At index 3, we consider appending the -2
-1,  2,  3, -2
             ^
Possible subsequences:
-1, (end)          [sum -1]
-1,  2 (end)       [sum  1]
 2 (end)           [sum  2]
-1,  2,  3 (end)   [sum  4]
 2,  3 (end)       [sum  5]
 3, (end)          [sum  3]
-1,  2,  3, -2     [sum  2]
 2,  3, -2         [sum  3]
 3, -2             [sum  1]
-2                 [sum -2]
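
The enumeration above can be sketched in a few lines of Ruby. This is only an illustration of the brute force idea, not part of the posted solution; the method name best_subsequence_bruteforce is made up for this example, and it returns [sum, start, stop] for the best list it sees.

```ruby
# Brute-force sketch: try every (start, stop) pair and keep the best sum.
# `best_subsequence_bruteforce` is a hypothetical name used for illustration.
def best_subsequence_bruteforce(seq)
  best = nil # will hold [sum, start, stop]
  (0...seq.length).each do |start|
    running = 0
    (start...seq.length).each do |stop|
      running += seq[stop] # sum of seq[start..stop]
      best = [running, start, stop] if best.nil? || running > best[0]
    end
  end
  best
end

p best_subsequence_bruteforce([-1, 2, 3, -2]) # prints [5, 1, 2]
```

This takes O(n^2) time for a sequence of length n, which is what the optimization below avoids.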

For this brute force approach, we finally pick the list with the best sum, (2, 3), and that's the answer. However, to make this efficient, notice that you don't really need to keep every one of the lists. Out of the lists that have not ended, you only need to keep the best one; the others cannot do any better. Out of the lists that have ended, you only need to keep the best one, and only if it's better than the ones that have not ended.

So, you can keep track of what you need with just a position array and a sum array. The position array is defined like this: position[r] = s means that the best list ending at r starts at s. And sum[r] gives the sum of that subsequence ending at index r. This optimized approach is Kadane's algorithm.

Running through the example again keeping track of our progress this way:

At index 0, we consider appending the -1
-1,  2,  3, -2
 ^
We start a new subsequence for the first element.
position[0] = 0
sum[0] = -1

At index 1, we consider appending the 2
-1,  2,  3, -2
     ^
We choose to start a new subsequence because that gives a higher sum than extending.
position[0] = 0      sum[0] = -1
position[1] = 1      sum[1] = 2


At index 2, we consider appending the 3
-1,  2,  3, -2
         ^
We choose to extend a subsequence because that gives a higher sum than starting a new one.
position[0] = 0      sum[0] = -1
position[1] = 1      sum[1] = 2
position[2] = 1      sum[2] = 5

At index 3, we consider appending the -2
-1,  2,  3, -2
             ^
Again, we choose to extend because that gives a higher sum than starting a new one.
position[0] = 0      sum[0] = -1
position[1] = 1      sum[1] = 2
position[2] = 1      sum[2] = 5
position[3] = 1      sum[3] = 3

Again, the best sum is 5 and the list is from index 1 to index 2, which is (2, 3).
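
Here is a minimal Ruby sketch of that bookkeeping for a single sequence. The name kadane_with_positions is hypothetical, and the sketch assumes the sequence is not empty. It extends the previous list only when its sum is positive, otherwise it starts a new list at the current index.

```ruby
# Kadane sketch with explicit position/sum arrays, mirroring the walkthrough.
# `kadane_with_positions` is a hypothetical name; assumes a non-empty sequence.
def kadane_with_positions(seq)
  position = Array.new(seq.length, 0) # position[r] = start of best list ending at r
  sum = Array.new(seq.length, 0)      # sum[r] = sum of that list
  sum[0] = seq[0]
  best = 0 # index where the best list found so far ends
  (1...seq.length).each do |r|
    if sum[r - 1] > 0
      # Extending the previous list beats starting a new one.
      sum[r] = sum[r - 1] + seq[r]
      position[r] = position[r - 1]
    else
      # Start a new list at r.
      sum[r] = seq[r]
      position[r] = r
    end
    best = r if sum[r] > sum[best]
  end
  [sum[best], position[best], best] # [best sum, start index, end index]
end

p kadane_with_positions([-1, 2, 3, -2]) # prints [5, 1, 2]
```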

Part II: Prefix sums

We want a way to compute the sum along a column, from any start point to any endpoint. We want to compute that sum in O(1) time rather than by adding, which takes O(m) time where m is the number of elements in the sum. With some precomputing, this can be achieved. Here's how. Suppose you have a matrix:

a   d   g
b   e   h 
c   f   i

You can precompute this matrix:

a      d      g
a+b    d+e    g+h
a+b+c  d+e+f  g+h+i

Once that is done, you can get the sum running along any column from any start point to any endpoint in that column just by subtracting two values. For example, b+c in the first column is (a+b+c) - a.
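
As a rough sketch of the precomputation and the O(1) lookup in Ruby (the helper names column_prefix_sums and column_range_sum are made up for this example):

```ruby
# Column-wise prefix sums: vps[i][j] = m[0][j] + ... + m[i][j].
# `column_prefix_sums` and `column_range_sum` are hypothetical helper names.
def column_prefix_sums(m)
  vps = []
  m.each_with_index do |row, i|
    # Each entry adds the current value to the prefix sum directly above it.
    vps << row.each_with_index.map { |v, j| i.zero? ? v : vps[i - 1][j] + v }
  end
  vps
end

# Sum of column j from row top to row bottom (inclusive), in O(1).
def column_range_sum(vps, j, top, bottom)
  vps[bottom][j] - (top.zero? ? 0 : vps[top - 1][j])
end

m = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 9]]
vps = column_prefix_sums(m)
p vps                            # prints [[1, 2, 3], [5, 7, 9], [12, 15, 18]]
p column_range_sum(vps, 1, 1, 2) # prints 13 (5 + 8)
```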

Part III: Bringing tricks together to find the max submatrix

Assume that you know the top and bottom row of the max submatrix. You could do this:

  1. Ignore rows above your top row and ignore rows below your bottom row.
  2. With what matrix remains, consider using the sum of each column to form a sequence (sort of like a single row that represents multiple rows). (You can compute any element of this sequence rapidly with the prefix sums approach.)
  3. Use Kadane's approach to figure out the best subsequence in this sequence. The indexes you get will tell you the left and right positions of the best submatrix.
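
The three steps above, for one fixed pair of rows, might look like this in Ruby. The name best_columns_for_rows is hypothetical, and to keep the sketch short it adds up each column directly instead of reading from precomputed prefix sums, so it shows the idea rather than the fast version. It assumes a non-empty matrix.

```ruby
# Sketch of steps 1-3 for one fixed pair of rows (top, bottom).
# `best_columns_for_rows` is a hypothetical name; for brevity the column sums
# are added up directly here rather than read from precomputed prefix sums.
def best_columns_for_rows(m, top, bottom)
  # Steps 1-2: collapse rows top..bottom into one sequence of column sums.
  seq = (0...m.first.length).map { |j| (top..bottom).sum { |i| m[i][j] } }

  # Step 3: Kadane over that sequence, remembering where the current run started.
  best = [seq[0], 0, 0] # [sum, left, right]
  running = seq[0]
  left = 0
  (1...seq.length).each do |j|
    if running > 0
      running += seq[j]
    else
      running = seq[j]
      left = j
    end
    best = [running, left, j] if running > best[0]
  end
  best
end

m = [[ 1, -2,  3],
     [-4,  5,  6],
     [ 7, -8, -9]]
p best_columns_for_rows(m, 0, 1) # prints [12, 1, 2]
```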

Now, what about actually figuring out the top and bottom row? Just try all possibilities. Try putting the top anywhere you can and putting the bottom anywhere you can, and run the Kadane-based procedure described previously for every possibility. Whenever you find a new max, you keep track of the top and bottom positions.

Trying every pair of top and bottom rows takes O(M^2) iterations, where M is the number of rows. For each pair, the Kadane pass that finds the left and right columns takes O(N) time, where N is the number of columns. So the total time is O(M^2 * N). And, if M=N, the time required is O(N^3).


About recovering the actual submatrix, and not just the maximum sum, here's what I got. Sorry, I do not have time to translate my code to your Java version, so I'm posting my Ruby code with some comments in the key parts:

def max_contiguous_submatrix_n3(m)
  rows = m.count
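  cols = rows > 0 ? m.first.count : 0 # 0 is truthy in Ruby, so compare explicitly
  return nil if rows == 0 || cols == 0 # nothing to search in an empty matrix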

  vps = Array.new(rows)
  for i in 0...rows # exclusive range: one prefix-sum row per matrix row
    vps[i] = Array.new(cols, 0)
  end

  for j in 0...cols
    vps[0][j] = m[0][j]
    for i in 1...rows
      vps[i][j] = vps[i-1][j] + m[i][j]
    end
  end

  max = [m[0][0],0,0,0,0] # this is the result, stores [max,top,left,bottom,right]
  # these arrays are used over Kadane
  sum = Array.new(cols) # obvious sum array used in Kadane
  pos = Array.new(cols) # keeps track of the beginning position for the max subseq ending in j

  for i in 0...rows
    for k in i...rows
      # Kadane over all columns with the i..k rows
      sum.fill(0) # clean both the sum and pos arrays for the upcoming Kadane
      pos.fill(0)
      local_max = 0 # we keep track of the position of the max value over each Kadane's execution
      # notice that we do not keep track of the max value, but only its position
      sum[0] = vps[k][0] - (i==0 ? 0 : vps[i-1][0])
      for j in 1...cols
        value = vps[k][j] - (i==0 ? 0 : vps[i-1][j])
        if sum[j-1] > 0
          sum[j] = sum[j-1] + value
          pos[j] = pos[j-1]
        else
          sum[j] = value
          pos[j] = j
        end
        if sum[j] > sum[local_max]
          local_max = j
        end
      end
      # Kadane ends here

      # Here's the key thing
      # If the max value obtained over the past Kadane's execution is larger than
      # the current maximum, then update the max array with sum and bounds
      if sum[local_max] > max[0]
        # sum[local_max] is the new max value
        # the corresponding submatrix goes from rows i..k.
        # and from columns pos[local_max]..local_max
        # the array below contains [max_sum,top,left,bottom,right]
        max = [sum[local_max], i, pos[local_max], k, local_max]
      end
    end
  end

  return max # return the array with [max_sum,top,left,bottom,right]
end
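
Once you have the returned array, slicing out the submatrix itself is short. A minimal sketch, assuming the [max_sum,top,left,bottom,right] layout used above; the helper name extract_submatrix is made up for this example, and the result array below is written by hand rather than produced by running the function.

```ruby
# Slice the submatrix out of m using a [max_sum, top, left, bottom, right]
# array. `extract_submatrix` is a hypothetical helper name.
def extract_submatrix(m, result)
  _sum, top, left, bottom, right = result
  m[top..bottom].map { |row| row[left..right] }
end

m = [[ 1, -2,  3],
     [-4,  5,  6],
     [ 7, -8, -9]]
# Hand-written result array: rows 0..1, columns 1..2, whose entries sum to 12.
p extract_submatrix(m, [12, 0, 1, 1, 2]) # prints [[-2, 3], [5, 6]]
```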

Some notes for clarification:

I use an array to store all the values pertaining to the result for convenience. You can just use five standalone variables: max, top, left, bottom, right. It's just easier to assign in one line to the array and then the subroutine returns the array with all the needed information.

If you copy and paste this code into an editor with Ruby syntax highlighting, it should be easier to follow. Hope this helps!