Numbers whose only prime factors are 2, 3, or 5 are called ugly numbers.

Example:

1, 2, 3, 4, 5, 6, 8, 9, 10, 12, 15, ... 

1 can be considered as 2^0.

I am working on finding nth ugly number. Note that these numbers are extremely sparsely distributed as n gets large.

I wrote a trivial program that computes if a given number is ugly or not. For n > 500 - it became super slow. I tried using memoization - observation: ugly_number * 2, ugly_number * 3, ugly_number * 5 are all ugly. Even with that it is slow. I tried using some properties of log - since that will reduce this problem from multiplication to addition - but, not much luck yet. Thought of sharing this with you all. Any interesting ideas?

Using a concept similar to Sieve of Eratosthenes (thanks Anon)

    for (int i(2), uglyCount(0); ; i++) {
        if (i % 2 == 0)
            continue;
        if (i % 3 == 0)
            continue;
        if (i % 5 == 0)
            continue;
        uglyCount++;
        if (uglyCount == n - 1)
            break;
    }

i is the nth ugly number.

Even this is pretty slow. I am trying to find the 1500th ugly number.


Solution 1:

A simple fast solution in Java. Uses approach described by Anon..
Here TreeSet is just a container capable of returning smallest element in it. (No duplicates stored.)

    int n = 20;
    SortedSet<Long> next = new TreeSet<Long>();
    next.add((long) 1);

    long cur = 0;
    for (int i = 0; i < n; ++i) {
        cur = next.first();
        System.out.println("number " + (i + 1) + ":   " + cur);

        next.add(cur * 2);
        next.add(cur * 3);
        next.add(cur * 5);
        next.remove(cur);
    }

Since 1000th ugly number is 51200000, storing them in bool[] isn't really an option.

edit
As a recreation from work (debugging stupid Hibernate), here's completely linear solution. Thanks to marcog for idea!

    int n = 1000;

    int last2 = 0;
    int last3 = 0;
    int last5 = 0;

    long[] result = new long[n];
    result[0] = 1;
    for (int i = 1; i < n; ++i) {
        long prev = result[i - 1];

        while (result[last2] * 2 <= prev) {
            ++last2;
        }
        while (result[last3] * 3 <= prev) {
            ++last3;
        }
        while (result[last5] * 5 <= prev) {
            ++last5;
        }

        long candidate1 = result[last2] * 2;
        long candidate2 = result[last3] * 3;
        long candidate3 = result[last5] * 5;

        result[i] = Math.min(candidate1, Math.min(candidate2, candidate3));
    }

    System.out.println(result[n - 1]);

The idea is that to calculate a[i], we can use a[j]*2 for some j < i. But we also need to make sure that 1) a[j]*2 > a[i - 1] and 2) j is smallest possible.
Then, a[i] = min(a[j]*2, a[k]*3, a[t]*5).

Solution 2:

I am working on finding nth ugly number. Note that these numbers are extremely sparsely distributed as n gets large.

I wrote a trivial program that computes if a given number is ugly or not.

This looks like the wrong approach for the problem you're trying to solve - it's a bit of a shlemiel algorithm.

Are you familiar with the Sieve of Eratosthenes algorithm for finding primes? Something similar (exploiting the knowledge that every ugly number is 2, 3 or 5 times another ugly number) would probably work better for solving this.

With the comparison to the Sieve I don't mean "keep an array of bools and eliminate possibilities as you go up". I am more referring to the general method of generating solutions based on previous results. Where the Sieve gets a number and then removes all multiples of it from the candidate set, a good algorithm for this problem would start with an empty set and then add the correct multiples of each ugly number to that.

Solution 3:

My answer refers to the correct answer given by Nikita Rybak. So that one could see a transition from the idea of the first approach to that of the second.

from collections import deque
def hamming():
    h=1;next2,next3,next5=deque([]),deque([]),deque([])
    while True:
        yield h
        next2.append(2*h)
        next3.append(3*h)
        next5.append(5*h)
        h=min(next2[0],next3[0],next5[0])
        if h == next2[0]: next2.popleft()
        if h == next3[0]: next3.popleft()
        if h == next5[0]: next5.popleft()

What's changed from Nikita Rybak's 1st approach is that, instead of adding next candidates into single data structure, i.e. Tree set, one can add each of them separately into 3 FIFO lists. This way, each list will be kept sorted all the time, and the next least candidate must always be at the head of one ore more of these lists.

If we eliminate the use of the three lists above, we arrive at the second implementation in Nikita Rybak' answer. This is done by evaluating those candidates (to be contained in three lists) only when needed, so that there is no need to store them.

Simply put:

In the first approach, we put every new candidate into single data structure, and that's bad because too many things get mixed up unwisely. This poor strategy inevitably entails O(log(tree size)) time complexity every time we make a query to the structure. By putting them into separate queues, however, you will see that each query takes only O(1) and that's why the overall performance reduces to O(n)!!! This is because each of the three lists is already sorted, by itself.