performance of log10 function returning an int

Today I needed a cheap log10 function, of which I only use the integer part. The result should be floored, so the log10 of 999 would be 2. Would it be beneficial to write such a function myself? And if so, which way would be best? Assume the code will not be optimized.

The alternatives to log10 I've thought of:

  • use a for loop dividing or multiplying by 10;
  • use a string parser (probably extremely expensive);
  • use an integer log2() function and multiply by a constant.
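For reference, the first alternative could look something like this (a minimal sketch in Python, assuming a positive integer input; the name `ilog10_loop` is made up here):

```python
def ilog10_loop(n):
    # Repeatedly divide by 10, counting how many divisions it takes.
    count = 0
    while n >= 10:
        n //= 10
        count += 1
    return count
```

This runs in time proportional to the number of digits, which is what the other alternatives try to beat.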

Thanks in advance :)


The operation can be done in (fast) constant time on any architecture that has a count-leading-zeros or similar instruction (which is most architectures). Here's a C snippet I have sitting around to compute the number of digits in base ten, which is essentially the same task (assumes a gcc-like compiler and 32-bit int):

unsigned int baseTwoDigits(unsigned int x) {
    return x ? 32 - __builtin_clz(x) : 0;
}

static unsigned int baseTenDigits(unsigned int x) {
    static const unsigned char guess[33] = {
        0, 0, 0, 0, 1, 1, 1, 2, 2, 2,
        3, 3, 3, 3, 4, 4, 4, 5, 5, 5,
        6, 6, 6, 6, 7, 7, 7, 8, 8, 8,
        9, 9, 9
    };
    static const unsigned int tenToThe[] = {
        1, 10, 100, 1000, 10000, 100000, 
        1000000, 10000000, 100000000, 1000000000,
    };
    unsigned int digits = guess[baseTwoDigits(x)];
    return digits + (x >= tenToThe[digits]);
}

GCC and clang compile this down to ~10 instructions on x86. With care, one can make it faster still in assembly.

The key insight is to use the (extremely cheap) base-two logarithm to get a fast estimate of the base-ten logarithm; at that point we only need to compare against a single power of ten to decide if we need to adjust the guess. This is much more efficient than searching through multiple powers of ten to find the right one.
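The same idea can be sketched in Python, where `int.bit_length()` plays the role of the CLZ instruction (a sketch only, assuming inputs that fit in 32 bits; the tables mirror the C version above):

```python
# guess[b] is a lower bound on the digit count of numbers with bit length b.
GUESS = [0, 0, 0, 0, 1, 1, 1, 2, 2, 2,
         3, 3, 3, 3, 4, 4, 4, 5, 5, 5,
         6, 6, 6, 6, 7, 7, 7, 8, 8, 8,
         9, 9, 9]
TEN_TO_THE = [10**i for i in range(10)]

def base_ten_digits(x):
    digits = GUESS[x.bit_length()]
    # At most one correction is ever needed: compare against a single power of ten.
    return digits + (x >= TEN_TO_THE[digits])
```

As in the C version, only one comparison against a power of ten is needed to correct the base-two estimate.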

If the inputs are overwhelmingly biased to one- and two-digit numbers, a linear scan is sometimes faster; for all other input distributions, this implementation tends to win quite handily.


One way to do it would be to loop over a table of powers of 10, which can be precomputed and stored. Here's an example in Python:

table = [10**i for i in range(1, 10)]
# [10, 100, 1000, 10000, 100000, 1000000, 10000000, 100000000, 1000000000]

def fast_log10(n):
    for i, k in enumerate(table):
        if n < k:
            return i
    return len(table)  # n >= 10**9

Usage example:

>>> fast_log10(1)
0
>>> fast_log10(10)
1
>>> fast_log10(100)
2
>>> fast_log10(999)
2
>>> fast_log10(1000)
3

You can also run a binary search over this table. The complexity is then only O(lg(n)), where n is the number of digits. Here's an example with binary search in C:

long int table[] = {10, 100, 1000, 10000, 100000, 1000000,
                    10000000, 100000000, 1000000000};
#define TABLE_LENGTH (sizeof(table) / sizeof(table[0]))

int bisect_log10(long int n, int s, int e) {
    if (s >= e)
        return s;
    int a = s + (e - s) / 2;
    if (table[a] <= n)
        return bisect_log10(n, a + 1, e);
    else
        return bisect_log10(n, s, a);
}

int fast_log10(long int n) {
    return bisect_log10(n, 0, TABLE_LENGTH);
}

Note that for small numbers this method would be slower than the method above.
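In Python the same binary search is available in the standard library: `bisect.bisect_right` returns the index of the first table entry strictly greater than n, which is exactly the floored log10 here (a sketch; `fast_log10_bisect` is a name invented for this example):

```python
import bisect

# Powers of ten from 10**1 up to 10**9, as in the table above.
table = [10**i for i in range(1, 10)]

def fast_log10_bisect(n):
    # Index of the first power of ten strictly greater than n.
    return bisect.bisect_right(table, n)
```

For example, `fast_log10_bisect(999)` counts the two powers (10 and 100) that are <= 999 and returns 2.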