How can the compile-time be (exponentially) faster than run-time?

GCC is likely memoizing constexpr functions (enabling a Θ(n) computation of fib(n)). That is safe for the compiler to do because constexpr functions are purely functional.

Compare the Θ(n) "compiler algorithm" (using memoization) to your Θ(φⁿ) run time algorithm (where φ is the golden ratio) and suddenly it makes perfect sense that the compiler is so much faster.

From the constexpr page on cppreference (emphasis added):

The constexpr specifier declares that it is possible to evaluate the value of the function or variable at compile time.

The constexpr specifier does not declare that it is required to evaluate the value of the function or variable at compile time. So one can only guess what heuristics GCC is using to choose whether to evaluate at compile time or run time when a compile time computation is not required by language rules. It can choose either, on a case-by-case basis, and still be correct.

If you want to force the compiler to evaluate your constexpr function at compile time, here's a simple trick that will do it.

constexpr auto compute_fib(const size_t n) -> long long
{
    return n < 2 ? n : compute_fib(n - 1) + compute_fib(n - 2);
}

template <std::size_t N>
struct fib
{
    static_assert(N >= 0, "N must be nonnegative.");
    static const long long value = compute_fib(N);
};

In the rest of your code you can then access fib<45>::value or fib<91>::value with the guarantee that they'll be evaluated at compile time.

At compile-time the compiler can memoize the result of the function. This is safe, because the function is a constexpr and hence will always return the same result of the same inputs.

At run-time it could in theory do the same. However most C++ programmers would frown at optimization passes that result in hidden memory allocations.

When you ask for fib(91) to give a value to your const fib91 in the source code, the compiler is forced to compute that value from you const expr. It does not compile the function (as you seem to think), just it sees that to compute fib91 it needs fib(90) and fib(89), to compute the it needs fib(87)... so on until he computes fib(1) which is given. This is an $O(n)$ algorithm and the result is computed fast enough.

However when you ask to evaluate fib(45) in runtime the compiler has to choose wether using the actual function call or precompute the result. Eventually it decides to use the compiled function. Now, the compiled function must execute exactly the exponential algorithm that you have decided there is no way the compiler could implement memoization to optimize a recursive function (think about the need to allocate some cache and to understand how many values to keep and how to manage them between function calls).

How can the compile-time be (exponentially) faster than run-time?

Related

Recent Posts