How to ask Clang++ not to cache function result during -O3 optimization?
Solution 1:
Out of curiosity, can you share why you would want to disable that optimization?
Anyway, about your question:
In your example code, s
is never read after the loop, so the compiler would throw the whole loop away. So let's assume that s
is used after the loop.
I'm not aware of any pragmas or compiler options to disable a particular optimization in a particular section of code.
Is changing the code an option? To prevent that optimization in a portable manner, you can look for a creative way to compute the function call argument in a way such that the compiler is no longer able to treat the argument as constant. Of course the challenge here is to actually use a trick that does not rely on undefined behavior and that cannot be "outsmarted" by a newer compiler version.
See the commented example below.
- pro: you use a trick that uses only the language that you can apply selectively
- con: you get an additional memory access in every loop iteration; however, the access will be satisfied by your CPU cache most of the time
I verified the generated assembly for your particular example with clang++ -O3 -S
. The compiler now generates your loop and no longer caches the result. However, the function gets inlined. If you want to prevent that as well, you can declare foo
with __attribute__((noinline))
, for example.
int foo(int x) {
return x + 1; // I have more complex code here
}
volatile int dummy = 0; // initialized to 0 and never changed
int main() {
int s = 0;
for (int i = 0; i < 1000000; ++i) {
// Because of the volatile variable, the compiler is forced to assume
// that the function call argument is different for each loop
// iteration and it is no longer able to use a cached result.
s += foo(42 + dummy);
}
}