Why does Clang optimize away x * 1.0 but NOT x + 0.0?
Why does Clang optimize away the loop in this code
#include <time.h>
#include <stdio.h>
static size_t const N = 1 << 27;
static double arr[N] = { /* initialize to zero */ };
int main()
{
    clock_t const start = clock();
    for (int i = 0; i < N; ++i) { arr[i] *= 1.0; }
    printf("%u ms\n", (unsigned)(clock() - start) * 1000 / CLOCKS_PER_SEC);
}
but not the loop in this code?
#include <time.h>
#include <stdio.h>
static size_t const N = 1 << 27;
static double arr[N] = { /* initialize to zero */ };
int main()
{
    clock_t const start = clock();
    for (int i = 0; i < N; ++i) { arr[i] += 0.0; }
    printf("%u ms\n", (unsigned)(clock() - start) * 1000 / CLOCKS_PER_SEC);
}
(Tagging as both C and C++ because I would like to know if the answer is different for each.)
The IEEE 754-2008 Standard for Floating-Point Arithmetic and the ISO/IEC 10967 Language Independent Arithmetic (LIA) Standard, Part 1 explain why this is so.
IEEE 754 § 6.3 The sign bit
When either an input or result is NaN, this standard does not interpret the sign of a NaN. Note, however, that operations on bit strings — copy, negate, abs, copySign — specify the sign bit of a NaN result, sometimes based upon the sign bit of a NaN operand. The logical predicate totalOrder is also affected by the sign bit of a NaN operand. For all other operations, this standard does not specify the sign bit of a NaN result, even when there is only one input NaN, or when the NaN is produced from an invalid operation.
When neither the inputs nor result are NaN, the sign of a product or quotient is the exclusive OR of the operands’ signs; the sign of a sum, or of a difference x − y regarded as a sum x + (−y), differs from at most one of the addends’ signs; and the sign of the result of conversions, the quantize operation, the roundToIntegral operations, and the roundToIntegralExact (see 5.3.1) is the sign of the first or only operand. These rules shall apply even when operands or results are zero or infinite.
When the sum of two operands with opposite signs (or the difference of two operands with like signs) is exactly zero, the sign of that sum (or difference) shall be +0 in all rounding-direction attributes except roundTowardNegative; under that attribute, the sign of an exact zero sum (or difference) shall be −0. However, x + x = x − (−x) retains the same sign as x even when x is zero.
The Case of Addition
Under the default rounding mode (Round-to-Nearest, Ties-to-Even), we see that x+0.0 produces x, EXCEPT when x is -0.0: In that case we have a sum of two operands with opposite signs whose sum is zero, and §6.3 paragraph 3 rules this addition produces +0.0.
Since +0.0 is not bitwise identical to the original -0.0, and -0.0 is a legitimate value that may occur as input, the compiler is obliged to emit code that transforms potential negative zeros to +0.0.
The summary: Under the default rounding mode, in x+0.0, if x
- is not -0.0, then x itself is an acceptable output value.
- is -0.0, then the output value must be +0.0, which is not bitwise identical to -0.0.
The Case of Multiplication
Under the default rounding mode, no such problem occurs with x*1.0. If x:
- is a (sub)normal number, x*1.0 == x always.
- is +/- infinity, then the result is +/- infinity of the same sign.
- is NaN, then according to
  IEEE 754 § 6.2.3 NaN Propagation
  An operation that propagates a NaN operand to its result and has a single NaN as an input should produce a NaN with the payload of the input NaN if representable in the destination format.
  which means that the exponent and mantissa (though not the sign) of NaN*1.0 are recommended to be unchanged from the input NaN. The sign is unspecified in accordance with §6.3p1 above, but an implementation may specify it to be identical to the source NaN.
- is +/- 0.0, then the result is a 0 with its sign bit XORed with the sign bit of 1.0, in agreement with §6.3p2. Since the sign bit of 1.0 is 0, the output value is unchanged from the input. Thus, x*1.0 == x even when x is a (negative) zero.
The Case of Subtraction
Under the default rounding mode, the subtraction x-0.0 is also a no-op, because it is equivalent to x + (-0.0). If x:
- is NaN, then §6.3p1 and §6.2.3 apply in much the same way as for addition and multiplication.
- is +/- infinity, then the result is +/- infinity of the same sign.
- is a (sub)normal number, x-0.0 == x always.
- is -0.0, then by §6.3p2 we have "[...] the sign of a sum, or of a difference x − y regarded as a sum x + (−y), differs from at most one of the addends’ signs;". This forces us to assign -0.0 as the result of (-0.0) + (-0.0), because -0.0 differs in sign from none of the addends, while +0.0 differs in sign from two of the addends, in violation of this clause.
- is +0.0, then this reduces to the addition case (+0.0) + (-0.0) considered above in The Case of Addition, which by §6.3p3 is ruled to give +0.0.
Since for all cases the input value is legal as the output, it is permissible to consider x-0.0 a no-op, and x == x-0.0 a tautology (in the bitwise sense, since a NaN never compares equal to itself).
Value-Changing Optimizations
The IEEE 754-2008 Standard has the following interesting quote:
IEEE 754 § 10.4 Literal meaning and value-changing optimizations
[...]
The following value-changing transformations, among others, preserve the literal meaning of the source code:
- Applying the identity property 0 + x when x is not zero and is not a signaling NaN and the result has the same exponent as x.
- Applying the identity property 1 × x when x is not a signaling NaN and the result has the same exponent as x.
- Changing the payload or sign bit of a quiet NaN.
- [...]
Since all NaNs and all infinities share the same exponent, and the correctly rounded result of x+0.0 and x*1.0 for finite x has exactly the same magnitude as x, their exponent is the same.
sNaNs
Signaling NaNs are floating-point trap values: they are special NaN values whose use as a floating-point operand results in an invalid operation exception (SIGFPE). If a loop that triggers an exception were optimized out, the software would no longer behave the same.
However, as user2357112 points out in the comments, the C11 Standard explicitly leaves undefined the behaviour of signaling NaNs (sNaN), so the compiler is allowed to assume they do not occur, and thus that the exceptions that they raise also do not occur. The C++11 standard omits describing a behaviour for signaling NaNs, and thus also leaves it undefined.
Rounding Modes
In alternate rounding modes, the permissible optimizations may change. For instance, under Round-to-Negative-Infinity mode, the optimization x+0.0 -> x becomes permissible, but x-0.0 -> x becomes forbidden.
To prevent GCC from assuming default rounding modes and behaviours, the experimental flag -frounding-math can be passed to GCC.
Conclusion
Clang and GCC, even at -O3, remain IEEE-754 compliant. This means they must keep to the above rules of the IEEE-754 standard. x+0.0 is not bit-identical to x for all x under those rules, but x*1.0 may be chosen to be so: Namely, when we
- Obey the recommendation to pass unchanged the payload of x when it is a NaN.
- Leave the sign bit of a NaN result unchanged by * 1.0.
- Obey the order to XOR the sign bit during a quotient/product, when x is not a NaN.
To enable the IEEE-754-unsafe optimization (x+0.0) -> x, the flag -ffast-math needs to be passed to Clang or GCC.
x += 0.0 isn't a NOOP if x is -0.0. The optimizer could strip out the whole loop anyway since the results aren't used, though. In general, it's hard to tell why an optimizer makes the decisions it does.