Is this floating-point optimization allowed?

Solution 1:

Note that the built-in operator != requires its operands to be of the same type, and will achieve that using promotions and conversions if necessary. In other words, your condition is equivalent to:

(float)i != (float)i

That should never fail, and so the code will eventually overflow i, giving your program Undefined Behaviour. Any behaviour is therefore possible.

To correctly check what you want to check, you should cast the result back to int:

if ((int)(float)i != i)

Solution 2:

As @Angew pointed out, the != operator needs the same type on both sides. (float)i != i results in promotion of the RHS to float as well, so we have (float)i != (float)i.

g++ also generates an infinite loop, but it doesn't optimize away the work from inside it. You can see it converts int->float with cvtsi2ss and does ucomiss xmm0,xmm0 to compare (float)i with itself. (That was your first clue that your C++ source doesn't mean what you thought it did like @Angew's answer explains.)

x != x is only true when it's "unordered" because x was NaN. (INFINITY compares equal to itself in IEEE math, but NaN doesn't. NAN == NAN is false, NAN != NAN is true).

gcc7.4 and older correctly optimizes your code to jnp as the loop branch (https://godbolt.org/z/fyOhW1) : keep looping as long as the operands to x != x weren't NaN. (gcc8 and later also checks je to a break out of the loop, failing to optimize based on the fact that it will always be true for any non-NaN input). x86 FP compares set PF on unordered.

And BTW, that means clang's optimization is also safe: it just has to CSE (float)i != (implicit conversion to float)i as being the same, and prove that i -> float is never NaN for the possible range of int.

(Although given that this loop will hit signed-overflow UB, it's allowed to emit literally any asm it wants, including a ud2 illegal instruction, or an empty infinite loop regardless of what the loop body actually was.) But ignoring the signed-overflow UB, this optimization is still 100% legal.

GCC fails to optimize away the loop body even with -fwrapv to make signed-integer overflow well-defined (as 2's complement wraparound). https://godbolt.org/z/t9A8t_

Even enabling -fno-trapping-math doesn't help. (GCC's default is unfortunately to enable
-ftrapping-math even though GCC's implementation of it is broken/buggy.) int->float conversion can cause an FP inexact exception (for numbers too large to be represented exactly), so with exceptions possibly unmasked it's reasonable not to optimize away the loop body. (Because converting 16777217 to float could have an observable side-effect if the inexact exception is unmasked.)

But with -O3 -fwrapv -fno-trapping-math, it's 100% missed optimization not to compile this to an empty infinite loop. Without #pragma STDC FENV_ACCESS ON, the state of the sticky flags that record masked FP exceptions is not an observable side-effect of the code. No int->float conversion can result in NaN, so x != x can't be true.

These compilers are all optimizing for C++ implementations that use IEEE 754 single-precision (binary32) float and 32-bit int.

The bugfixed (int)(float)i != i loop would have UB on C++ implementations with narrow 16-bit int and/or wider float, because you'd hit signed-integer overflow UB before reaching the first integer that wasn't exactly representable as a float.

But UB under a different set of implementation-defined choices doesn't have any negative consequences when compiling for an implementation like gcc or clang with the x86-64 System V ABI.

BTW, you could statically calculate the result of this loop from FLT_RADIX and FLT_MANT_DIG, defined in <climits>. Or at least you can in theory, if float actually fits the model of an IEEE float rather than some other kind of real-number representation like a Posit / unum.

I'm not sure how much the ISO C++ standard nails down about float behaviour and whether a format that wasn't based on fixed-width exponent and significand fields would be standards compliant.

In comments:

@geza I would be interested to hear the resulting number!

@nada: it's 16777216

Are you claiming you got this loop to print / return 16777216?

Update: since that comment has been deleted, I think not. Probably the OP is just quoting the float before the first integer that can't be exactly represented as a 32-bit float. https://en.wikipedia.org/wiki/Single-precision_floating-point_format#Precision_limits_on_integer_values i.e. what they were hoping to verify with this buggy code.

The bugfixed version would of course print 16777217, the first integer that's not exactly representable, rather than the value before that.

(All the higher float values are exact integers, but they're multiples of 2, then 4, then 8, etc. for exponent values higher than the significand width. Many higher integer values can be represented, but 1 unit in the last place (of the significand) is greater than 1 so they're not contiguous integers. The largest finite float is just below 2^128, which is too large for even int64_t.)

If any compiler did exit the original loop and print that, it would be a compiler bug.

Is this floating-point optimization allowed?

Solution 1:

Solution 2:

Related

Recent Posts