Why is it impossible to find a specified value in a sorted array faster than O(log n)?

Solution 1:

Let's imagine that you have an array of n items. If you perform a lookup in this array, then that lookup can return one of n + 1 values: either "the item isn't present," or "the item is present at index i" for any of the n indices.

Now, suppose that the only way your algorithm is allowed to work with the array is by asking questions of the form "is the item greater than or equal to the item at index i?" for some choice of i, and let's imagine that you ask questions of this form k total times. There are then 2^k possible ways that the comparisons could come back. To see why: there are two options for how the first comparison can go (either "yes" or "no"), two options for how the second comparison can go, two options for the third, and so on. Multiplying all those 2s together gives 2^k.

We now have two constraints:

  1. Any correct algorithm must be able to return one of n + 1 different options.
  2. With k comparisons, there are 2^k possible ways those comparisons can work out.

This means that we have to have n + 1 ≤ 2^k, since otherwise there aren't enough possible comparison outcomes to distinguish all n + 1 possible results of the search. Taking the log base two of both sides gives lg (n + 1) ≤ k, so the number of comparisons made must be Ω(log n).

Stated differently - if your algorithm makes too few queries, there aren't enough possible ways for those comparisons to go to ensure that every possible option can be produced.
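
To make the counting argument concrete, here is a minimal Python sketch (the helper names are my own) comparing the ceil(lg(n + 1)) lower bound with the number of comparisons a standard binary search actually makes:

    import math

    def comparisons_needed(n):
        # Smallest k with n + 1 <= 2^k, i.e. k = ceil(lg(n + 1)).
        return math.ceil(math.log2(n + 1))

    def binary_search_count(arr, target):
        # Standard binary search that counts its "<" style comparisons.
        lo, hi, count = 0, len(arr), 0
        while lo < hi:
            mid = (lo + hi) // 2
            count += 1
            if arr[mid] < target:
                lo = mid + 1
            else:
                hi = mid
        found = lo if lo < len(arr) and arr[lo] == target else None
        return found, count

    arr = list(range(0, 2000, 2))         # 1000 sorted items
    print(comparisons_needed(len(arr)))   # 10: the lower bound ceil(lg 1001)
    print(binary_search_count(arr, 998))  # (499, 10): binary search matches that order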


Of course, if you aren't in the comparison model, you can outperform searches in an array using hash tables. Some hash tables give expected O(1) lookups, while others (say, cuckoo hashing) give worst-case O(1) lookups.
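
As a quick illustration (using Python's built-in dict as the hash table), membership lookups stop depending on n altogether:

    # Build a hash table over the same values that the sorted array holds.
    sorted_arr = list(range(0, 1_000_000, 3))
    index_of = {value: index for index, value in enumerate(sorted_arr)}

    print(index_of.get(999_999))  # 333333, found in expected O(1) time
    print(index_of.get(500_000))  # None: 500000 is not a multiple of 3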

Moving outside the comparison model, there are algorithms that, subject to certain assumptions, have expected runtimes lower than O(log n). For example, interpolation search can find items in sorted arrays in expected time O(log log n), provided that the data are sampled from a uniform distribution. It works by making numeric estimates of where in the sequence to choose the next item to probe, and works well in practice.
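
Here is a rough sketch of interpolation search in Python (written from the description above, assuming sorted numeric data); the probe position is estimated by linear interpolation rather than always picking the midpoint:

    def interpolation_search(arr, target):
        # Expected O(log log n) on uniformly distributed sorted numeric data.
        lo, hi = 0, len(arr) - 1
        while lo <= hi and arr[lo] <= target <= arr[hi]:
            if arr[hi] == arr[lo]:           # all remaining values are equal
                break
            # Estimate where the target should sit, by linear interpolation.
            pos = lo + (target - arr[lo]) * (hi - lo) // (arr[hi] - arr[lo])
            if arr[pos] == target:
                return pos
            if arr[pos] < target:
                lo = pos + 1
            else:
                hi = pos - 1
        return lo if lo < len(arr) and arr[lo] == target else None

    print(interpolation_search(list(range(0, 1000, 5)), 635))  # 127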

On the more theoretical side of things, fusion trees can perform searches in time O(log n / log w), where w is the machine word size, provided that the values are integers that fit into a single machine word. This can be improved down to the surprising runtime of O(sqrt(log n / log log n)). It's known that if the n values each fit into a single machine word, then the predecessor lower bound says you can't do better than the (very unusual) bound of Ω(min{log w / log log w, sqrt(log n / log log n)}). These algorithms outperform the Ω(log n) lower bound by making multiple comparisons in parallel using creative operations on individual machine words.

Solution 2:

To start with, be careful about using the word "faster" when talking about Big-O complexity, as is done in the question title. Big-O says nothing about how fast an algorithm is in absolute terms. Big-O only tells how the execution time changes when some variable N changes. Example:

O(1) algorithm   : Doubling N will not change execution time
O(N) algorithm   : Doubling N will double execution time (1 + 1 = 2) 
O(N^2) algorithm : Doubling N will quadruple execution time (2 * 2 = 4)

Also notice that for some N values an O(N^2) algorithm may be faster than an O(N) algorithm. Big-O doesn't tell us anything about that. All we can say is that if we keep increasing N, then sooner or later the O(N) algorithm will be faster than the O(N^2) algorithm. But Big-O doesn't tell us what that value of N is. It can be at N=1, N=10, N=100, ... So be careful about "translating" Big-O complexity into "fast".
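
A tiny sketch with made-up cost models (the constant factors are invented purely for illustration) shows the crossover point that Big-O hides:

    # Hypothetical cost models: the O(N) algorithm has a large constant factor,
    # the O(N^2) algorithm a small one. Big-O hides exactly these constants.
    cost_linear = lambda n: 100 * n      # O(N)
    cost_quadratic = lambda n: n * n     # O(N^2)

    for n in (10, 50, 100, 200):
        faster = "O(N^2)" if cost_quadratic(n) < cost_linear(n) else "O(N)"
        print(n, faster)  # the O(N^2) model wins below N = 100, O(N) wins from there on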

Big-O is calculated as the number of times you need to perform some basic operation (an O(1) operation) as a function of the variable N. Further, Big-O (normally) looks at worst-case scenarios.

For an ordinary array, the only basic operation we can perform is to look up the value at some index i in the range 1..N.

In a sorted array, the returned value can be used to tell us one of three things (see a later paragraph for exceptions):

  1. Is the value less than the value we are searching for?
  2. Is the value greater than the value we are searching for?
  3. Is the value equal to the value we are searching for?

Now remember that Big-O is about worst-case scenarios. So, in the worst case, number 3 won't happen until we are down to a range with only one array element. So forget about number 3 for now.

Because the array is sorted, the relevant answers can be translated into:

  1. The search value is in range 1 .. i
  2. The search value is in range i+1 .. N

Now the question is: how do we select the best value of i for the first look-up?

Since Big-O is a worst-case calculation, we shall always use the larger of the two ranges for the next step. To make the larger range as small as possible, we need to make the two ranges the same size. To do that, we need i to equal N/2. This is simply the best we can do with the knowledge we have from the look-up.

By doing that we have

  1. The search value is in range 1 .. N/2
  2. The search value is in range N/2+1 .. N

So in the next step we need to look in a range with N/2 elements.

Now apply the same idea again (i.e. i = N/2/2) for the next search to get down to N/4 elements. Do it again to get to N/8 elements, and so on.

Repeat that until there is 1 element in the range - then we have the solution (number 3 above).

So each look-up reduces the range to half of its original size, and k look-ups will reduce the range by a factor of 2^k. Our target is to get a range of size 1, so we need to solve:

N / 2^k = 1 <=> N = 2^k <=> k = log2(N)

So, from the assumption that all we can learn from a look-up is whether our search value is to the left or right of the look-up position, and from the fact that Big-O is a worst-case calculation, we can see that searching a sorted array can't be any better than log(N) complexity.
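
The halving scheme described above is, of course, just binary search. A small Python sketch (the look-up counter is added purely for illustration) shows that the number of look-ups grows as log2(N):

    def halving_search(arr, target):
        # Each look-up splits the remaining range in half, as argued above.
        lo, hi = 0, len(arr) - 1
        lookups = 0
        while lo <= hi:
            i = (lo + hi) // 2      # the i = N/2 choice for the current range
            lookups += 1
            if arr[i] == target:
                return i, lookups
            if arr[i] < target:
                lo = i + 1          # the search value is in the upper half
            else:
                hi = i - 1          # the search value is in the lower half
        return None, lookups

    arr = list(range(1024))          # N = 1024 = 2^10
    print(halving_search(arr, 1023)) # (1023, 11): roughly log2(1024) = 10 look-ups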

The above covers the general array, but notice that for some arrays the assumption may not hold. Sometimes an algorithm can extract more information from the looked-up value at index i. For instance, if we know something about the distribution of values in the array and our search value is far from the looked-up value, an algorithm may benefit from probing somewhere other than N/2 in the next look-up and thereby be more efficient.

Solution 3:

If you do not know any preliminary information about the key distribution, then your search really is O(log(n)), because each comparison extracts at most 1 bit of information and cuts the search area in half. But in many practical cases you can search a sorted array much faster. For instance, see interpolation search, which runs in expected O(log(log(n))).

Solution 4:

The running time of an algorithm cannot be less than the size of its output, because at the very least it has to write that output. Here, the output of an algorithm which searches for a value in a sorted array is the index of the value found in the array (even if your actual algorithm produces something different, it would still have to determine that index internally). It is an integer between 0 and the number N of items in the array.

Encoding a value among N possible values requires a worst-case size of Ω(log(N)). Indeed, let's say you have an alphabet of two symbols: 0 and 1. There are 2^k strings of length k, and there are 1 + 2 + 2^2 + … + 2^(k−1) = 2^k − 1 strings of length strictly less than k. Hence, if you want an encoding for N = 2^k (or more) distinct values, it is not possible that all values are encoded with fewer than k symbols. Necessarily, at least one of them will be encoded with at least k symbols. So your worst-case size is Ω(k) = Ω(log(N)).

For instance, to write any integer between 0 and 2^16 − 1, you need 16 binary digits.

A larger alphabet does not increase your expressive power. It only changes a constant factor, which is hidden by the asymptotic notations.

To conclude, the output is a value that you need to tell apart among N possible values, and for that you need Ω(log(N)) symbols in the worst case.

A remark: if you use two symbols (i.e. write integers in binary), and N = 2^k (to simplify), then the dichotomic search algorithm that you surely know (or will know very soon) consists exactly in determining the k symbols (bits) of the output one by one, from the most significant bit to the least significant bit.
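
To illustrate that remark, here is a small Python sketch (my own phrasing of the search, assuming the array length is exactly 2^k) that builds the answer index bit by bit, one comparison per bit:

    def search_by_bits(arr, target):
        # Dichotomic search on an array of length 2^k, phrased as deciding the
        # k bits of the answer index from most significant to least significant.
        k = len(arr).bit_length() - 1    # assumes len(arr) == 2^k
        index = 0
        for bit in range(k - 1, -1, -1):
            # Tentatively set this bit; keep it only if we don't overshoot.
            candidate = index | (1 << bit)
            if arr[candidate] <= target:
                index = candidate
        return index if arr[index] == target else None

    arr = [2 * i for i in range(16)]     # 2^4 = 16 elements: 0, 2, ..., 30
    print(search_by_bits(arr, 22))       # 11, found with exactly k = 4 comparisons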

Solution 5:

The only information we have is that the array is sorted. Say we are at the middle of the array: based on 1 comparison we can decide whether the left half has the answer, the right half has the answer, or the middle element itself is the answer, and then we recursively repeat the comparison until we find the element. The point to note is that we can only do this because the array is sorted. Now assume we stand at some other point instead: we can still apply the same algorithm, but we may partition the array unevenly, which can blow up the number of comparisons. If, say, the chosen point is always the starting or ending element of the array, the algorithm has O(N) worst-case complexity. So the best strategy is to reduce the search space by half in each recursive call, which a single comparison allows us to do, and we cannot decide this with fewer than 1 comparison unless some kind of pre-processing is done, which would itself take more than O(log(N)).
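
As a rough illustration of why the even split matters, here is a small Python sketch (the probe-picking strategies are made up for the comparison) that counts worst-case look-ups for a halving probe versus always probing an end of the range:

    def count_lookups(n, pick):
        # Worst-case number of look-ups when pick(lo, hi) chooses the probe
        # index inside the current range [lo, hi].
        def worst(lo, hi):
            if lo > hi:
                return 0
            i = pick(lo, hi)
            # Worst case: the answer lies in the larger of the two remaining parts.
            return 1 + max(worst(lo, i - 1), worst(i + 1, hi))
        return worst(0, n - 1)

    n = 500
    print(count_lookups(n, lambda lo, hi: (lo + hi) // 2))  # 9: halving gives about log2(N) look-ups
    print(count_lookups(n, lambda lo, hi: lo))              # 500: always probing an end degrades to O(N)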