Find top N elements in an Array

Solution 1:

The time could be reduced to linear time:

  1. Use the selection algorithm, which effectively find the k-th element in a un-sorted array in linear time. You can either use a variant of quick sort or more robust algorithms.

  2. Get the top k using the pivot got in step 1.

Solution 2:

How about delegating everything to Java ;)

function findTopN(Array list, int n)
{
    Set sortedSet<Integer> = new TreeSet<>(Comparators.naturalOrder());

    // add all elements from list to sortedSet

    // return the first n from sortedSet
}

I am not trying to say that this is the best way. I still think Yin Zhu's method of finding the kth largest element is the best answer.

Solution 3:

If you're dealing with simple elements like fixed-length integers, then provided you can spare a memory buffer of the same size as the input data, sorting can be done in O(n) time using bucket or radix sorts, and this will be the fastest.

Although there are linear-time selection algorithms, the hidden constant is very high -- around 24. That means an O(nlog n) algorithm will be typically faster for fewer than several million elements.

Otherwise, in the general case when you can only compare 2 elements and determine which is greater, the problem is best solved by a heap data structure.

Suppose you want the top k of n items. All solutions based on fully sorting the data require O(nlog n) time, while using a heap requires only O(nlog k) time -- just build a heap on the first k elements, then keep adding an element and removing the maximum. This will leave you with a heap containing the smallest k elements.