How does the C++ compiler evaluate recursive constexpr functions so quickly?
I've been learning about C++ constexpr
functions, and I implemented a constexpr
recursive function to find the nth fibonacci number.
#include <iostream>
#include <fstream>
#include <cmath>
#include <algorithm>
#include <vector>
constexpr long long fibonacci(int num) {
if (num <= 2) return 1;
return fibonacci(num - 1) + fibonacci(num - 2);
}
int main() {
auto start = clock();
long long num = fibonacci(70);
auto duration = (clock() - start) / (CLOCKS_PER_SEC / 1000.);
std::cout << num << "\n" << duration << std::endl;
}
If I remove the constexpr
identifier from the fibonacci()
function, then fibonacci(70)
takes a very long time to evaluate (more than 5 minutes). When I keep it as-is, however, the program still compiles within 3 seconds and prints out the correct result in less than 0.1
milliseconds.
I've learned that constexpr
functions are evaluated at compile time, so this would mean that fibonacci(70)
is calculated by the compiler in less than 3 seconds! However, it doesn't seem quite right that the C++ compiler would have such better calculation performance than C++ code.
My question is, does the C++ compiler actually evaluate the function between the time I press the "Build" button and the time the compilation finishes? Or am I misunderstanding the keyword constexpr
?
EDIT: This program was compiled with g++ 7.5.0
with --std=c++17
.
constexpr
functions have no side-effects and can thus be memoized without worry. Given the disparity in runtime the simplest explanation is that the compiler memoizes constexpr functions during compile-time. This means that fibonacci(n)
is only computed once for each n
, and all other recursive calls get returned from a lookup table.
To add some details to what other's pointed out: constexpr
function doesn't have to be computed at runtime and one of the parameters that can affect it is -fconstexpr-ops-limit
.
On GCC 10.2.0, -fconstexpr-ops-limit=1000000000
(1B) and fibonacci(40)
results in a pre-compiled value, but if you drop the limit to 10000000 (10M) then function is computed at run-time. If you want to make sure the value is always computed at compile time, you need to mark long long num
as constexpr
in addition to the fibonacci
function.
Note: the opposite example would be a non-constexpr function computed at compile time (optimized out) and marking it with __attribute__ ((const))
might help compiler make such decision. However, my compiler didn't optimize it out.
In g++ 8.3.0, if you use constexpr
in this case, it computes the value you are using and outputs the result as a constant. This is even without optimizations:
//#include <iostream>
constexpr long long fibonacci(int num){
if(num <= 2){return 1;}
return fibonacci(num - 1) + fibonacci(num - 2);
}
int main(int argc, char** argv)
{
//double start = clock();
long long num = fibonacci(70);
//std::cout << num << std::endl;
//cout << (clock()-start)/(CLOCKS_PER_SEC/1000) << endl;
return 0;
}
.file "constexpr.cc"
.text
.globl main
.type main, @function
main:
.LFB1:
.cfi_startproc
pushq %rbp
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
movq %rsp, %rbp
.cfi_def_cfa_register 6
movl %edi, -20(%rbp)
movq %rsi, -32(%rbp)
movabsq $190392490709135, %rax
movq %rax, -8(%rbp)
movl $0, %eax
popq %rbp
.cfi_def_cfa 7, 8
ret
.cfi_endproc
.LFE1:
.size main, .-main
.ident "GCC: (Debian 8.3.0-6) 8.3.0"
.section .note.GNU-stack,"",@progbits
I was wondering why there is so huge difference between the code and the compiler in terms of execution time.
It seems it computes it without recursion. With recursion is just too slow.
What surprises me is that it can convert a recursive function into a iterative one, even without optimization, at compile time. At least that's what it seems.