Sum of digits of a factorial

I'm not sure who is still paying attention to this thread, but here goes anyway.

First, in the official-looking linked version, it only has to be 1000 factorial, not 10000 factorial. Also, when this problem was reused in another programming contest, the time limit was 3 seconds, not 1 second. This makes a huge difference in how hard you have to work to get a fast enough solution.

Second, for the actual parameters of the contest, Peter's solution is sound, but with one extra twist you can speed it up by a factor of 5 with 32-bit architecture. (Or even a factor of 6 if only 1000! is desired.) Namely, instead of working with individual digits, implement multiplication in base 100000. Then at the end, total the digits within each super-digit. I don't know how good a computer you were allowed in the contest, but I have a desktop at home that is roughly as old as the contest. The following sample code takes 16 milliseconds for 1000! and 2.15 seconds for 10000! The code also ignores trailing 0s as they show up, but that only saves about 7% of the work.

#include <stdio.h>
int main() {
    unsigned int dig[10000], first=0, last=0, carry, n, x, sum=0;
    dig[0] = 1;    
    for(n=2; n <= 9999; n++) {
        carry = 0;
        for(x=first; x <= last; x++) {
            carry = dig[x]*n + carry;
            dig[x] = carry%100000;
            if(x == first && !(carry%100000)) first++;
            carry /= 100000; }
        if(carry) dig[++last] = carry; }
    for(x=first; x <= last; x++)
        sum += dig[x]%10 + (dig[x]/10)%10 + (dig[x]/100)%10 + (dig[x]/1000)%10
            + (dig[x]/10000)%10;
    printf("Sum: %d\n",sum); }

Third, there is an amazing and fairly simple way to speed up the computation by another sizable factor. With modern methods for multiplying large numbers, it does not take quadratic time to compute n!. Instead, you can do it in O-tilde(n) time, where the tilde means that you can throw in logarithmic factors. There is a simple acceleration due to Karatsuba that does not bring the time complexity down to that, but still improves it and could save another factor of 4 or so. In order to use it, you also need to divide the factorial itself into equal sized ranges. You make a recursive algorithm prod(k,n) that multiplies the numbers from k to n by the pseudocode formula

prod(k,n) = prod(k,floor((k+n)/2))*prod(floor((k+n)/2)+1,n)

Then you use Karatsuba to do the big multiplication that results.

Even better than Karatsuba is the Fourier-transform-based Schonhage-Strassen multiplication algorithm. As it happens, both algorithms are part of modern big number libraries. Computing huge factorials quickly could be important for certain pure mathematics applications. I think that Schonhage-Strassen is overkill for a programming contest. Karatsuba is really simple and you could imagine it in an A+ solution to the problem.

Part of the question posed is some speculation that there is a simple number theory trick that changes the contest problem entirely. For instance, if the question were to determine n! mod n+1, then Wilson's theorem says that the answer is -1 when n+1 is prime, and it's a really easy exercise to see that it's 2 when n=3 and otherwise 0 when n+1 is composite. There are variations of this too; for instance n! is also highly predictable mod 2n+1. There are also some connections between congruences and sums of digits. The sum of the digits of x mod 9 is also x mod 9, which is why the sum is 0 mod 9 when x = n! for n >= 6. The alternating sum of the digits of x mod 11 equals x mod 11.

The problem is that if you want the sum of the digits of a large number, not modulo anything, the tricks from number theory run out pretty quickly. Adding up the digits of a number doesn't mesh well with addition and multiplication with carries. It's often difficult to promise that the math does not exist for a fast algorithm, but in this case I don't think that there is any known formula. For instance, I bet that no one knows the sum of the digits of a googol factorial, even though it is just some number with roughly 100 digits.

This is A004152 in the Online Encyclopedia of Integer Sequences. Unfortunately, it doesn't have any useful tips about how to calculate it efficiently - its maple and mathematica recipes take the naive approach.

I'd attack the second problem, to compute N! mod (N+1), using Wilson's theorem. That reduces the problem to testing whether N is prime.

Small, fast python script found at http://www.penjuinlabs.com/blog/?p=44. It's elegant but still brute force.

import sys
for arg in sys.argv[1:]:
    print reduce( lambda x,y: int(x)+int(y), 
          str( reduce( lambda x, y: x*y, range(1,int(arg)))))

$ time python sumoffactorialdigits.py 432 951 5436 606 14 9520
3798
9639
74484
5742
27
141651

real    0m1.252s
user    0m1.108s
sys     0m0.062s

Assume you have big numbers (this is the least of your problems, assuming that N is really big, and not 10000), and let's continue from there.

The trick below is to factor N! by factoring all n<=N, and then compute the powers of the factors.

Have a vector of counters; one counter for each prime number up to N; set them to 0. For each n<= N, factor n and increase the counters of prime factors accordingly (factor smartly: start with the small prime numbers, construct the prime numbers while factoring, and remember that division by 2 is shift). Subtract the counter of 5 from the counter of 2, and make the counter of 5 zero (nobody cares about factors of 10 here).

compute all the prime number up to N, run the following loop

for (j = 0; j< last_prime; ++j) {
  count[j] = 0;
  for (i = N/ primes[j]; i; i /= primes[j])
    count[j] += i; 
}

Note that in the previous block we only used (very) small numbers.

For each prime factor P you have to compute P to the power of the appropriate counter, that takes log(counter) time using iterative squaring; now you have to multiply all these powers of prime numbers.

All in all you have about N log(N) operations on small numbers (log N prime factors), and Log N Log(Log N) operations on big numbers.

and after the improvement in the edit, only N operations on small numbers.

HTH

Windows 8 messed up my Windows 7 user folder permissions

Why is my computer not updating via Windows Update?

Is there a ready-to-run Solaris virtual machine for VMWare?

Problems with frame rate on video conversion using ffmpeg with libx264

Modern Linux UI to work with Virtual Box Seamless mode

How can I pipe the output of *all* entered shell commands into another? (e.g. pipe everything into 'lolcat')

Cisco AnyConnect Secure Mobility Client 3.1.03103. The VPN Agent service is not responding

Use IPv6 temporary address only with remote peers

Windows 7 doesn't request MTU from DHCP server; what is it thinking?