Difference between O(n) and O(log(n)) - which is better and what exactly is O(log(n))?
This is my first course in data structures, and in every lecture / TA lecture we talk about O(log(n)). This is probably a dumb question, but I'd appreciate it if someone could explain to me exactly what it means.
It means that the thing in question (usually running time) scales in a manner that is consistent with the logarithm of its input size.
Big-O notation doesn't give an exact equation, but rather a bound on how fast something grows. For instance, the following functions are all O(n):
f(x) = 3x
g(x) = 0.5x
m(x) = x + 5
Because as you increase x, their outputs all increase linearly: if there's a 6:1 ratio between f(n) and g(n), there will also be approximately a 6:1 ratio between f(10*n) and g(10*n), and so on.
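As a quick check of the ratio argument, here is a minimal Python sketch using the three functions from this answer. The helper name `ratio_f_to_g` and the sample sizes are made up for illustration.

```python
# The three example functions, all O(n).
def f(x):
    return 3 * x

def g(x):
    return 0.5 * x

def m(x):
    return x + 5

def ratio_f_to_g(n):
    """Ratio between f and g at input size n (hypothetical helper)."""
    return f(n) / g(n)

for n in (10, 100, 1000):
    # f/g stays at 6; f/m is only approximately constant because of the
    # "+ 5" term, and it approaches 3 as n grows.
    print(n, ratio_f_to_g(n), f(n) / m(n))
```

The constant factors and the additive term change the exact values, but not the fact that all three grow linearly.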
As for whether O(n) or O(log n) is better, consider: if n = 1000, then log n = 3 (for log-base-10). Which would you rather have your algorithm take to run: 1000 seconds, or 3 seconds?
The short answer: O(log n) is better than O(n).
Now, what exactly is O(log n)?
Generally, when referring to big-O notation, log n refers to the base-2 logarithm (in the same way that ln represents the base-e logarithm). The base only changes a constant factor, so it does not affect the big-O class. The base-2 logarithm is the inverse of the exponential function 2^x. An exponential function grows very rapidly, so we can intuitively deduce that its inverse does the exact opposite, i.e. it grows very slowly.
For example, if
x = log2(n)
then we can represent n as
n = 2^x
And
2^10 = 1024 → lg(1024) = 10
2^20 = 1,048,576 → lg(1048576) = 20
2^30 = 1,073,741,824 → lg(1073741824) = 30
Large increments in n only lead to a very small increase in log(n).
For a complexity of O(n), on the other hand, we get a linear relationship: doubling n doubles the work.
Given the choice, a factor of log2(n) should be preferred over a factor of n.
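The powers of two above can be reproduced with a few lines of Python. This is just an illustrative sketch; the function name `log2_table` is made up for the example.

```python
import math

def log2_table(exponents):
    """Return (n, log2(n)) pairs for n = 2**k, for each k in exponents."""
    return [(2 ** k, math.log2(2 ** k)) for k in exponents]

for n, lg in log2_table([10, 20, 30]):
    # n grows by a factor of about a thousand per row; log2(n) grows by 10.
    print(f"n = {n:>13,}   log2(n) = {lg:.0f}")
```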
To further solidify this, I came across an example in Algorithms Unlocked by Thomas Cormen.
Consider 2 computers: A and B.
Both computers have the task of searching an array for a value. Let's assume the arrays have 10 million elements to be searched through.
Computer A: This computer can execute 1 billion instructions per second and is expected to perform the above task using an algorithm with a complexity of O(n). We can approximate the time it takes this computer to complete the task as
n / (instructions per second) → 10^7 / 10^9 = 0.01 seconds
Computer B: This computer is much slower, and can execute only 10 million instructions per second. Computer B is expected to perform the above task using an algorithm with a complexity of O(log n). We can approximate the time it takes this computer to complete the task as
log2(n) / (instructions per second) → log2(10^7) / 10^7 ≈ 0.000002325349 seconds
With this illustration, we can see that even though computer A is much faster than computer B, computer B completes the task much more quickly because of the algorithm it uses.
I think it should be very clear now why O(log(n)) is much faster than O(n).
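The arithmetic for the two computers can be checked with a short Python sketch. It makes the same simplifying assumption as the example: the O(n) algorithm takes exactly n instructions and the O(log n) algorithm takes exactly log2(n). The constant and function names are made up for illustration.

```python
import math

N = 10_000_000            # elements in the array
SPEED_A = 1_000_000_000   # computer A: instructions per second
SPEED_B = 10_000_000      # computer B: instructions per second

def linear_time(n, instructions_per_second):
    """Approximate seconds for an algorithm taking n instructions."""
    return n / instructions_per_second

def log_time(n, instructions_per_second):
    """Approximate seconds for an algorithm taking log2(n) instructions."""
    return math.log2(n) / instructions_per_second

time_a = linear_time(N, SPEED_A)
time_b = log_time(N, SPEED_B)

print(f"Computer A, O(n):     {time_a:.6f} s")
print(f"Computer B, O(log n): {time_b:.12f} s")
print(f"B is faster by a factor of about {time_a / time_b:.0f}")
```

Real algorithms have constant factors that this ignores, so treat the numbers as an order-of-magnitude comparison only.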
For an input of size n, an algorithm of O(n) will perform a number of steps proportional to n, while another algorithm of O(log(n)) will perform roughly log(n) steps.
For large n, log(n) is much smaller than n, hence the algorithm of complexity O(log(n)) is better, since it will be much faster.
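To make the step counts concrete, here is a sketch that counts comparisons for two standard search strategies: a linear scan, which is O(n), and binary search, which is O(log n) but assumes the data is already sorted. These two algorithms are chosen for illustration and the function names are made up; nothing above says they are the ones being compared.

```python
def linear_search_steps(items, target):
    """Count the elements inspected by a left-to-right scan."""
    steps = 0
    for value in items:
        steps += 1
        if value == target:
            break
    return steps

def binary_search_steps(sorted_items, target):
    """Count the loop iterations binary search performs on sorted data."""
    steps = 0
    lo, hi = 0, len(sorted_items) - 1
    while lo <= hi:
        steps += 1
        mid = (lo + hi) // 2
        if sorted_items[mid] == target:
            break
        if sorted_items[mid] < target:
            lo = mid + 1   # target can only be in the upper half
        else:
            hi = mid - 1   # target can only be in the lower half
    return steps

data = list(range(1_000_000))   # already sorted
target = 999_999                # worst case for the linear scan

print("linear scan steps:  ", linear_search_steps(data, target))
print("binary search steps:", binary_search_steps(data, target))
```

For a million elements the scan inspects every element in this worst case, while binary search halves the remaining range each step and needs at most 20 iterations.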
O(log n) means that the algorithm's maximum running time grows at most proportionally to the logarithm of the input size. O(n) means that the algorithm's maximum running time grows at most proportionally to the input size.
Basically, O(something) is an upper bound on the number of (atomic) instructions the algorithm executes. Therefore, O(log n) is a tighter bound than O(n) and is also better in terms of algorithm analysis. Note that every algorithm that is O(log n) is also O(n), but not the other way around.
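The last point can be illustrated numerically. The sketch below is not a proof; it only spot-checks that log2(n) stays below n, while the ratio n / log2(n) keeps growing, so no fixed constant c can make n ≤ c · log2(n) hold for all large n. The function names are made up for the example.

```python
import math

def log_bounded_by_linear(limit):
    """Check log2(n) <= n for every n from 1 to limit."""
    return all(math.log2(n) <= n for n in range(1, limit + 1))

def linear_over_log(n):
    """How many times larger n is than log2(n), for n > 1."""
    return n / math.log2(n)

print(log_bounded_by_linear(10_000))
for n in (2 ** 10, 2 ** 20, 2 ** 30):
    # The ratio keeps increasing, so n is not O(log n).
    print(n, linear_over_log(n))
```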