Is it legal for a C++ optimizer to reorder calls to clock()?

The C++ Programming Language 4th edition, page 225 reads: A compiler may reorder code to improve performance as long as the result is identical to that of the simple order of execution. Some compilers, e.g. Visual C++ in release mode, will reorder this code:

#include <time.h>
...
auto t0 = clock();
auto r  = veryLongComputation();
auto t1 = clock();

std::cout << r << "  time: " << t1-t0 << endl;

into this form:

auto t0 = clock();
auto t1 = clock();
auto r  = veryLongComputation();

std::cout << r << "  time: " << t1-t0 << endl;

which guarantees different result than original code (zero vs. greater than zero time reported). See my other question for detailed example. Is this behavior compliant with the C++ standard?


Solution 1:

Well, there is something called Subclause 5.1.2.3 of the C Standard [ISO/IEC 9899:2011] which states:

In the abstract machine, all expressions are evaluated as specified by the semantics. An actual implementation need not evaluate part of an expression if it can deduce that its value is not used and that no needed side effects are produced (including any caused by calling a function or accessing a volatile object).

Therefore I really suspect that this behaviour - the one you described - is compliant with the standard.

Furthermore - the reorganization indeed has an impact on the computation result, but if you look at it from compiler perspective - it lives in the int main() world and when doing time measurements - it peeps out, asks the kernel to give it the current time, and goes back into the main world where the actual time of the outside world doesn't really matter. The clock() itself won't affect the program and variables and program behaviour won't affect that clock() function.

The clocks values are used to calculate difference between them - that is what you asked for. If there is something going on, between the two measuring, is not relevant from compilers perspective since what you asked for was clock difference and the code between the measuring won't affect the measuring as a process.

This however doesn't change the fact that the described behaviour is very unpleasant.

Even though inaccurate measurements are unpleasant, it could get much more worse and even dangerous.

Consider the following code taken from this site:

void GetData(char *MFAddr) {
    char pwd[64];
    if (GetPasswordFromUser(pwd, sizeof(pwd))) {
        if (ConnectToMainframe(MFAddr, pwd)) {
              // Interaction with mainframe
        }
    }
    memset(pwd, 0, sizeof(pwd));
}

When compiled normally, everything is OK, but if optimizations are applied, the memset call will be optimized out which may result in a serious security flaw. Why does it get optimized out? It is very simple; the compiler again thinks in its main() world and considers the memset to be a dead store since the variable pwd is not used afterwards and won't affect the program itself.

Solution 2:

The compiler cannot exchange the two clock calls. t1 must be set after t0. Both calls are observable side effects. The compiler may reorder anything between those observable effects, and even over an observable side effect, as long as the observations are consistent with possible observations of an abstract machine.

Since the C++ abstract machine is not formally restricted to finite speeds, it could execute veryLongComputation() in zero time. Execution time itself is not defined as an observable effect. Real implementations may match that.

Mind you, a lot of this answer depends on the C++ standard not imposing restrictions on compilers.

Solution 3:

Yes, it is legal - if the compiler can see the entirety of the code that occurs between the clock() calls.

Solution 4:

If veryLongComputation() internally performs any opaque function call, then no, because the compiler cannot guarantee that its side effects would be interchangeable with those of clock().

Otherwise, yes, it is interchangeable.
This is the price you pay for using a language in which time isn't a first-class entity.

Note that memory allocation (such as new) can fall in this category, as allocation function can be defined in a different translation unit and not compiled until the current translation unit is already compiled. So, if you merely allocate memory, the compiler is forced to treat the allocation and deallocation as worst-case barriers for everything -- clock(), memory barriers, and everything else -- unless it already has the code for the memory allocator and can prove that this is not necessary. In practice I don't think any compiler actually looks at the allocator code to try to prove this, so these types of function calls serve as barriers in practice.