Fast exponentiation algorithm - How to arrive at it?

I've been learning about fast exponentiation when I found this algorithm:

int expo(int a, int b) {
  int result = 1;

  while (b) {
    if (b%2==1) {
      result *= a;
    }
    b /= 2;
    a *= a;
  }

  return result;
}

This is supposed to compute $a^b$ in logarithmic time, but I couldn't work out or find anywhere a way to arrive at this procedure, other than the following identity (which didn't help me understand why the algorithm is correct):

$$ x^n = \begin{cases} 1 & \text{if $n=0$} \\ \left(\frac{1}{x}\right)^{-n} & \text{if $n<0$} \\ x \cdot \left( x^{\frac{n-1}{2}} \right)^2 & \text{if $n$ is odd} \\ \left( x^{\frac{n}{2}} \right)^2 & \text{if $n$ is even} \end{cases} $$

Supposedly the transition from this identity to the actual algorithm is obvious, but I honestly don't see it, even after working through quite a few examples by hand.

Can anyone explain?


Solution 1:

Write $b = b_0 + b_1 2 + b_2 2^2 + ... + b_n 2^n$ where the $b_k$ are the binary digits of $b$. Note that $n = \lfloor \log_2 b \rfloor$, so $n$ grows logarithmically with $b$. Then:

$$ a^b = a^{b_0} \cdot (a^2)^{b_1} \cdot (a^{2^2})^{b_2} \cdot ... \cdot (a^{2^n})^{b_n} $$

The terms where bits $b_k = 0$ evaluate to $1$ and can be skipped. The terms with $b_k = 1$ are of the form $a^{2^k}$ and can be calculated by repeatedly squaring $a$ $k$ times. This is precisely what the posted code does.
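To make the correspondence explicit, here is the posted loop again with the invariant from this argument written out as comments. (Using `long long` instead of `int` is an assumption made here just to delay overflow; it is not part of the original code.)

```c
/* Binary exponentiation, as posted in the question, annotated with the
   invariant from this answer. Assumes b >= 0. Let a0, b0 denote the
   initial values of a and b. */
long long expo(long long a, long long b) {
    long long result = 1;
    /* Invariant: result * a^b == a0^b0 at the top of every iteration.
       In round k, a holds a0^(2^k) and b holds the remaining binary
       digits b_k, b_{k+1}, ..., b_n of the original exponent. */
    while (b > 0) {
        if (b % 2 == 1)   /* digit b_k is 1: the factor a0^(2^k) belongs in the product */
            result *= a;
        b /= 2;           /* discard digit b_k */
        a *= a;           /* a becomes a0^(2^(k+1)) by squaring */
    }
    return result;
}
```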

Solution 2:

The algorithm uses the binary expansion of the exponent to reduce the number of multiplications one has to do. If you take $a$ and square it, then square the result, and so on, you produce the numbers $a, a^2, a^4, a^8,\ldots,a^{2^k}$ until $2^{k+1}>b$. That takes about $\log_2 b$ multiplications. Then if you express $b$ in binary, say $b=1101001$ (base $2$), then $a^b = a^{2^6+2^5+2^3+2^0} = a^{2^6}a^{2^5}a^{2^3}a^{2^0}$, and you've just computed all these powers of $a$, so multiply them together; that's about $\log_2 b$ more multiplications.
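This two-phase reading (square first, then multiply the selected powers) can be sketched directly in C; the helper name `expo_two_phase` and the fixed-size table are assumptions of this sketch, not part of any answer above:

```c
/* Two-phase binary exponentiation, following this answer literally.
   Assumes b >= 0 and that results fit in long long. */
long long expo_two_phase(long long a, long long b) {
    long long squares[64];   /* squares[k] will hold a^(2^k) */
    int k = 0;
    /* Phase 1: about log2(b) squarings produce a, a^2, a^4, ... */
    long long s = a, t = b;
    while (t > 0) {
        squares[k++] = s;
        s *= s;
        t /= 2;
    }
    /* Phase 2: multiply together the powers picked out by the
       binary digits of b -- about log2(b) more multiplications. */
    long long result = 1;
    for (int i = 0; i < k; i++) {
        if ((b >> i) & 1)
            result *= squares[i];
    }
    return result;
}
```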

Solution 3:

I can hardly follow your code, but here is a description of the algorithm in pseudocode: $S$ runs through the successive squares of $a$, and $P$ accumulates the partial product of the powers of $a$:

Input: $\;a, n$

Output: $\;a^n$

$S\leftarrow a$, $P\leftarrow 1$ ;

While $n>1$ do

$\qquad$If odd($n$) then $P\leftarrow P*S\enspace$ fi ;

$\qquad n\leftarrow \lfloor n/2\rfloor$ ;

$\qquad S\leftarrow S^2$

Endwhile

$ P\leftarrow P*S$.

End
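A possible C rendering of this pseudocode is below. Note that $P$ must start at $1$ (not $a$) for the final step $P\leftarrow P*S$ to yield $a^n$; the function name `power` and the `long long` type are assumptions of this sketch. It assumes $n \ge 1$:

```c
/* Translation of the pseudocode above. S tracks the repeated squares
   of a; P accumulates the product. Assumes n >= 1. */
long long power(long long a, long long n) {
    long long S = a, P = 1;
    while (n > 1) {
        if (n % 2 == 1)   /* odd(n): fold the current square into P */
            P *= S;
        n /= 2;           /* n <- floor(n / 2) */
        S *= S;           /* S <- S^2 */
    }
    return P * S;         /* final P <- P * S: n has reached 1, and S
                             holds the power for the leading binary digit */
}
```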

Solution 4:

If you want to do something in logarithmic time, you probably need to work with the digits of the number in some base, rather than with the number itself: the digital representation of a number is an inherently logarithmic-sized object. Additionally, since $b$ is the main thing that determines how big $a^b$ is, we probably need to think of $b$ in a logarithmic way, or we have no hope of making the algorithm run in logarithmic time.

If you now consider $b$ in base $2$, this is the natural algorithm that falls out.

One may also be inspired by the binary search algorithm, which finds an element in a sorted list. It works by dividing the list in two, determining whether the element lies in the first (resp. second) half, and repeating the algorithm on the now-half-sized list which is the first (resp. second) half of the original. This "divide and conquer" idea turns up all over the place when we want to do things in logarithmic time; the algorithm you have given is a natural answer to the question "can we divide-and-conquer this?".
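For comparison, the divide-and-conquer reading corresponds directly to the recursive case split in the question's identity (restricted here, as an assumption, to integer $a$ and $n \ge 0$; the name `expo_rec` is made up for this sketch):

```c
/* Recursive divide-and-conquer exponentiation, one case of the
   question's identity per branch. Assumes n >= 0. */
long long expo_rec(long long a, long long n) {
    if (n == 0)
        return 1;                          /* x^0 = 1 */
    long long half = expo_rec(a, n / 2);   /* x^floor(n/2), the "conquered" half */
    if (n % 2 == 1)
        return a * half * half;            /* x^n = x * (x^((n-1)/2))^2 */
    return half * half;                    /* x^n = (x^(n/2))^2 */
}
```

Unrolling this recursion from the least-significant digit of $n$ upward gives exactly the iterative loop in the question.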