Why is iterative k-way merge O(nk^2)?
k-way merge is the algorithm that takes as input k sorted arrays, each of size n. It outputs a single sorted array of all the elements.
It does so by using the "merge" routine central to the merge sort algorithm to merge array 1 to array 2, and then array 3 to this merged array, and so on until all k arrays have merged.
I had thought that this algorithm is O(kn) because the algorithm traverses each of the k arrays (each of length n) once. Why is it O(nk^2)?
Solution 1:
Because it doesn't traverse each of the k arrays once. The first array is traversed k-1 times, the first as merge(array-1,array-2), the second as merge(merge(array-1, array-2), array-3) ... and so on.
The result is k-1 merges with an average size of n*(k+1)/2 giving a complexity of O(n*(k^2-1)/2) which is O(nk^2).
The mistake you made was forgetting that the merges are done serially rather than in parallel, so the arrays are not all size n.
Solution 2:
Actually in the worst case scenario,there will be n comparisons for the first array, 2n for the second, 3n for the third and soon till (k - 1)n.
So now the complexity becomes simply
n + 2n + 3n + 4n + ... + (k - 1)n
= n(1 + 2 + 3 + 4 + ... + (k - 1))
= n((k - 1)*k) / 2
= n(k^2 - k) / 2
= O(nk ^ 2)
:-)
Solution 3:
How about this:
Step 1: Merge arrays (1 and 2), arrays (3 and 4), and so on. (k/2 array merges of 2n, total work kn).
Step 2: Merge array (1,2 and 3,4), arrays (5,6 and 7,8), and so on (k/4 merges of 4n, total work kn).
Step 3: Repeat...
There will be log(k) such "Steps", each with kn work. Hence total work done = O(k.n.log(k)).
Even otherwise, if we were to just sort all the elements of the array we could still merge everything in O(k.n.log(k.n)) time.