Argument order to std::min changes compiler output for floating-point
I was fiddling in Compiler Explorer, and I found that the order of arguments passed to std::min changes the emitted assembly.
Here's the example on Godbolt Compiler Explorer
#include <algorithm>

double std_min_xy(double x, double y) {
    return std::min(x, y);
}
double std_min_yx(double x, double y) {
    return std::min(y, x);
}
This compiles (with -O3 on clang 9.0.0, for example) to:
std_min_xy(double, double):                # @std_min_xy(double, double)
        minsd   xmm1, xmm0
        movapd  xmm0, xmm1
        ret
std_min_yx(double, double):                # @std_min_yx(double, double)
        minsd   xmm0, xmm1
        ret
This persists if I change the std::min to an old-school ternary operator. It also persists across all the modern compilers I tried out (clang, gcc, icc).
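(The ternary versions I mean look roughly like this; an illustrative sketch rather than the exact code I tried:)
double ternary_min_xy(double x, double y) {
    return (x < y) ? x : y;  // one argument order...
}
double ternary_min_yx(double x, double y) {
    return (y < x) ? y : x;  // ...and the other; the extra movapd still shows up in one of the two
}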
The underlying instruction is minsd. Reading the documentation, the first argument of minsd is also the destination for the answer. Apparently xmm0 is where my function is supposed to put its return value, so if xmm0 is used as the first argument, there is no movapd needed. But if xmm0 is the second argument, then it has to movapd xmm0, xmm1 to get the value into xmm0. (Editor's note: yes, x86-64 System V passes FP args in xmm0, xmm1, etc., and returns in xmm0.)
My question: why doesn't the compiler switch the order of the arguments itself, so that this movapd isn't necessary? It surely must know that the order of arguments to minsd does not change the answer? Is there some side-effect that I'm not appreciating?
minsd a,b is not commutative for some special FP values, and neither is std::min, unless you use -ffast-math.
minsd a,b exactly implements (a<b) ? a : b, including everything that implies about signed-zero and NaN in strict IEEE-754 semantics (i.e. it keeps the source operand, b, on unordered¹ or equal). As Artyer points out, -0.0 and +0.0 compare equal (i.e. -0. < 0. is false), but they are distinct.
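Here's a minimal sketch of that behaviour (my own illustration, not from the question):
#include <cmath>
#include <cstdio>

// (a < b) ? a : b keeps b when the operands compare equal or unordered.
static double sel_min(double a, double b) { return (a < b) ? a : b; }

int main() {
    double qnan = std::nan("");
    std::printf("%g\n", sel_min(1.0, qnan));  // prints nan: 1.0 < NaN is false, so b (NaN) is kept
    std::printf("%g\n", sel_min(qnan, 1.0));  // prints 1:   NaN < 1.0 is false, so b (1.0) is kept
    std::printf("%d\n", std::signbit(sel_min(+0.0, -0.0)));  // 1: equal, so b (-0.0) is kept
    std::printf("%d\n", std::signbit(sel_min(-0.0, +0.0)));  // 0: equal, so b (+0.0) is kept
}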
std::min is defined in terms of a plain < comparison (cppreference gives (b < a) ? b : a as the example implementation), so it returns its first argument when the arguments compare equal, and in practice also when they're unordered. That's unlike std::fmin, which has fully specified, order-independent NaN handling: if exactly one operand is a NaN, the other operand is returned. (fmin originally came from the C math library, not a C++ template.)
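A minimal sketch of std::fmin's order-independent NaN handling (assuming the standard-library behaviour just described):
#include <cmath>
#include <cstdio>

// std::fmin treats a lone NaN operand as missing data and returns the other
// operand, regardless of argument order; only NaN/NaN gives back a NaN.
int main() {
    double qnan = std::nan("");
    std::printf("%g\n", std::fmin(qnan, 1.0));  // prints 1
    std::printf("%g\n", std::fmin(1.0, qnan));  // prints 1
    std::printf("%g\n", std::fmin(qnan, qnan)); // prints nan
}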
See What is the instruction that gives branchless FP min and max on x86? for much more detail about minss/minsd / maxss/maxsd (and the corresponding intrinsics, which follow the same non-commutative rules except in some GCC versions.)
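Here's a minimal sketch of that non-commutativity at the intrinsics level (my own illustration, subject to the GCC caveat just mentioned):
#include <emmintrin.h>
#include <cmath>
#include <cstdio>

// _mm_min_sd(a, b) maps to minsd and implements (a < b) ? a : b,
// so the second operand wins on equal or unordered inputs.
static double min_sd(double a, double b) {
    return _mm_cvtsd_f64(_mm_min_sd(_mm_set_sd(a), _mm_set_sd(b)));
}

int main() {
    std::printf("%d\n", std::signbit(min_sd(+0.0, -0.0)));  // 1: -0.0 (the second operand) is returned
    std::printf("%d\n", std::signbit(min_sd(-0.0, +0.0)));  // 0: +0.0 (the second operand) is returned
}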
Footnote 1: Remember that NaN<b is false for any b, and for any comparison predicate; e.g. NaN == b is false, and so is NaN > b. Even NaN == NaN is false. When one or more of a pair are NaN, they are "unordered" wrt. each other.
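A tiny illustration of the footnote:
#include <cmath>
#include <cassert>

int main() {
    double qnan = std::nan("");
    // Every ordered comparison involving a NaN is false:
    assert(!(qnan < 1.0));
    assert(!(qnan > 1.0));
    assert(!(qnan == 1.0));
    assert(!(qnan == qnan));  // even NaN == NaN is false
}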
With -ffast-math (to tell the compiler to assume no NaNs, and other assumptions and approximations), compilers will optimize either function to a single minsd. https://godbolt.org/z/a7oK91
For GCC, see https://gcc.gnu.org/wiki/FloatingPointMath. Clang supports similar options, including -ffast-math as a catch-all.
Some of those options should be enabled by almost everyone, except for weird legacy codebases, e.g. -fno-math-errno. (See this Q&A for more about recommended math optimizations.) And gcc -fno-trapping-math is a good idea: -ftrapping-math is on by default but doesn't fully work anyway (some optimizations can still change the number of FP exceptions that would be raised if exceptions were unmasked, sometimes even from 1 to 0 or from 0 to non-zero, IIRC). gcc -ftrapping-math also blocks some optimizations that are 100% safe even wrt. exception semantics, so it's pretty bad. In code that doesn't use fenv.h, you'll never know the difference.
But treating std::min as commutative can only be accomplished with options that assume no NaNs, and stuff like that, so it definitely can't be called "safe" for code that cares about exactly what happens with NaNs. e.g. -ffinite-math-only assumes no NaNs (and no infinities).
clang -funsafe-math-optimizations -ffinite-math-only will do the optimization you're looking for. (unsafe-math-optimizations implies a bunch of more specific options, including not caring about signed-zero semantics.)
Consider: std::signbit(std::min(+0.0, -0.0)) == false && std::signbit(std::min(-0.0, +0.0)) == true.
The only other difference is when both arguments are NaN (possibly with different payloads): in practice std::min then returns its first argument, so swapping the arguments changes which NaN comes back.
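A minimal runnable check of the signed-zero claim above (my sketch, relying on the usual library behaviour):
#include <algorithm>
#include <cmath>
#include <cassert>

// std::min returns its first argument when the operands compare equal,
// so the two zero orderings come back with different signs.
int main() {
    assert(std::signbit(std::min(+0.0, -0.0)) == false);  // +0.0 is returned
    assert(std::signbit(std::min(-0.0, +0.0)) == true);   // -0.0 is returned
}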
You can allow gcc to reorder the arguments by using the -funsafe-math-optimizations -ffinite-math-only optimizations (both enabled by -ffast-math). unsafe-math-optimizations allows the compiler to not care about signed zero, and finite-math-only to not care about NaNs.
To expand on the existing answers that say std::min isn't commutative: here's a concrete example that reliably distinguishes std_min_xy from std_min_yx. Godbolt:
bool distinguish1() {
    return 1 / std_min_xy(0.0, -0.0) > 0.0;
}
bool distinguish2() {
    return 1 / std_min_yx(0.0, -0.0) > 0.0;
}
distinguish1() evaluates to 1 / 0.0 > 0.0, i.e. INFTY > 0.0, or true.
distinguish2() evaluates to 1 / -0.0 > 0.0, i.e. -INFTY > 0.0, or false.
(All this under IEEE rules, of course. I don't think the C++ standard mandates that compilers preserve this particular behavior. Honestly I was surprised that the expression -0.0 actually evaluated to a negative zero in the first place!)
-ffinite-math-only eliminates this way of telling the difference, and -ffinite-math-only -funsafe-math-optimizations completely eliminates the difference in codegen.