Time complexity of Brian Kernighan's algorithm for counting set bits in an integer
Can someone explain why Brian Kernighan's algorithm takes O(log N) time to count the set bits (1s) in an integer? A simple implementation of this algorithm is below (in Java):
int count_set_bits(int n) {
    int count = 0;
    while (n != 0) {
        n &= (n - 1);
        count++;
    }
    return count;
}
I understand how it works: it clears the rightmost set bit one at a time until n becomes 0, but I just don't see how we get O(log N).
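For example, here is my own worked trace (with n = 13, binary 1101, chosen just for illustration) showing what each pass of n &= (n - 1) does:

    // trace for n = 13 (binary 1101)
    // pass 1: 1101 & 1100 = 1100   (lowest set bit cleared)
    // pass 2: 1100 & 1011 = 1000
    // pass 3: 1000 & 0111 = 0000   -> loop ends, count = 3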
This algorithm goes through as many iterations as there are set bits. So if we have a 32-bit word with only the high bit set, then it will only go once through the loop. In the worst case, it will pass once per bit. An integer n has log(n) bits, hence the worst case is O(log(n)). Here's your code annotated at the important bits (pun intended):
int count_set_bits(int n) {
    int count = 0;       // count accumulates the total bits set
    while (n != 0) {
        n &= (n - 1);    // clear the least significant bit set
        count++;
    }
    return count;
}
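As a quick sanity check (this helper is not part of the original snippet; Integer.bitCount is only used here for comparison), you can verify that the number of loop passes equals the number of set bits:

    // Illustrative check: loop iterations equal the population count.
    static void demo(int n) {
        int iterations = 0;
        int x = n;
        while (x != 0) {
            x &= (x - 1);   // each pass removes exactly one set bit
            iterations++;
        }
        // Integer.bitCount is the JDK's built-in population count
        System.out.println(n + " -> iterations = " + iterations
                + ", Integer.bitCount = " + Integer.bitCount(n));
    }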
There are floor(lg(N)) + 1 significant bits in N -- that's a base-2 logarithm. The number of 1 bits in N is at most this. So the time has the asymptotic upper bound O(lg(N)) = O(log(N)).
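To make that bound concrete, here is a small sketch of my own (the use of Integer.numberOfLeadingZeros is just one way to compute the bit length of a positive int):

    // For N > 0: bit length = floor(lg(N)) + 1, and the set-bit count never exceeds it.
    static void bound(int n) {
        int bitLength = 32 - Integer.numberOfLeadingZeros(n); // floor(lg(n)) + 1 for n > 0
        int setBits   = Integer.bitCount(n);
        System.out.println(n + ": bitLength = " + bitLength + ", setBits = " + setBits);
    }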
This question is really about the meaning of N in big-O notation, not the complexity of the algorithm.
N means the size of the data. But in a case where the "data" is a single number, you need to define what you mean by the size of the data: the value of the number, or the length of its representation.
IMO the algorithm is O(N), because in this problem of counting 1s in a binary representation the relevant size of the data is the length of the number's representation, not its value, i.e. the length of the bit stream. And the obvious worst case is all 1s, taking N iterations.
But if you consider the value of N as the size of the data, then its representation has length log(N), so you can say it's O(log(N)).
PS: Big-O notation also only makes sense if you generalise the algorithm to arbitrarily large N. In this code N is limited by the int size, so you could even say it's O(1), as there will be at most 32 loop iterations for Java's 32-bit int (or 64 for a 64-bit long).
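A sketch of that bounded worst case (choosing n = -1, which has all 32 bits set in two's complement, is my own example):

    // Worst case for a 32-bit int: every bit set, so the loop runs exactly 32 times.
    int n = -1;                     // two's complement: all 32 bits are 1
    int iterations = 0;
    while (n != 0) {
        n &= (n - 1);               // clears one bit per pass
        iterations++;
    }
    System.out.println(iterations); // prints 32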