Use of min and max functions in C++
In C++, are std::min and std::max preferable to fmin and fmax? For comparing two integers, do they provide basically the same functionality?
Do you tend to use one of these sets of functions or do you prefer to write your own (perhaps to improve efficiency, portability, flexibility, etc.)?
Notes:
- The C++ Standard Template Library (STL) declares the min and max functions in the standard C++ algorithm header.
- The C standard (C99) provides the fmin and fmax functions in the standard C math.h header.
Thanks in advance!
Solution 1:
fmin and fmax are specifically for use with floating point numbers (hence the "f"). If you use them for ints, you may suffer performance or precision losses due to conversion, function call overhead, etc., depending on your compiler/platform.
std::min and std::max are template functions (defined in header <algorithm>) which work on any type with a less-than (<) operator, so they can operate on any data type that allows such a comparison. You can also provide your own comparison function if you don't want it to work off <.
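As a minimal sketch of the two points above (any type with <, and an optional custom comparison), the helper names larger_magnitude and first_alphabetically below are hypothetical and exist only for illustration:

```cpp
#include <algorithm>
#include <cstdlib>
#include <string>

// Hypothetical helper: returns whichever value has the larger magnitude.
// The lambda replaces the default operator< that std::max would use.
inline int larger_magnitude(int a, int b) {
    return std::max(a, b, [](int lhs, int rhs) {
        return std::abs(lhs) < std::abs(rhs);
    });
}

// std::min works on any type with operator<, for example std::string,
// where < is a lexicographic comparison.
inline std::string first_alphabetically(const std::string& a,
                                        const std::string& b) {
    return std::min(a, b);
}
```

When the two arguments compare as equivalent under the comparison, std::min and std::max both return the first argument.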
This is safer since you have to explicitly convert arguments to match when they have different types. The compiler won't let you accidentally convert a 64-bit int into a 64-bit float, for example. This reason alone should make the templates your default choice. (Credit to Matthieu M & bk1e)
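A small sketch of that safety argument, assuming IEEE-754 doubles with a 53-bit significand. The function names are made up for the example. With std::min the common type has to be spelled out, while routing 64-bit integers through fmin rounds them to double:

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>

// std::min(count, limit) would not compile here: the arguments have
// different types, so template argument deduction fails. Naming the type
// explicitly makes the conversion visible in the source.
inline std::int64_t min_exact(std::int32_t count, std::int64_t limit) {
    return std::min<std::int64_t>(count, limit);
}

// Hypothetical counter-example: going through fmin converts both integers
// to double, which cannot represent every 64-bit integer exactly.
// The casts are written out to show where the rounding happens.
inline std::int64_t min_via_fmin(std::int64_t a, std::int64_t b) {
    return static_cast<std::int64_t>(
        std::fmin(static_cast<double>(a), static_cast<double>(b)));
}
```

For inputs such as 2^53 + 1 and 2^53 + 2, min_via_fmin cannot return the first value exactly, because 2^53 + 1 has no double representation.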
Even when used with floats the template may win in performance. A compiler always has the option of inlining calls to template functions since the source code is part of the compilation unit. Sometimes it's impossible to inline a call to a library function, on the other hand (shared libraries, absence of link-time optimization, etc.).
Solution 2:
There is an important difference between std::min, std::max and fmin and fmax.
std::min(-0.0,0.0) = -0.0
std::max(-0.0,0.0) = -0.0
whereas
fmin(-0.0, 0.0) = -0.0
fmax(-0.0, 0.0) = 0.0
So std::min is not a 1-1 substitute for fmin. The functions std::min and std::max are not commutative. To get the same result with doubles as with fmin and fmax, one should swap the arguments:
fmin(-0.0, 0.0) = std::min(-0.0, 0.0)
fmax(-0.0, 0.0) = std::max( 0.0, -0.0)
But as far as I can tell all these functions are implementation defined anyway in this case so to be 100% sure you have to test how they are implemented.
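The std::min/std::max half of this can be checked with a short sketch. It relies only on the fact that both functions return their first argument when neither argument compares less than the other, and that -0.0 == 0.0. The helper names are hypothetical. fmin and fmax are deliberately left out, since their signed-zero result may depend on the implementation:

```cpp
#include <algorithm>
#include <cmath>

// True only for negative zero: it compares equal to 0.0 but has the
// sign bit set.
inline bool is_negative_zero(double v) {
    return v == 0.0 && std::signbit(v);
}

// Hypothetical wrappers that report which zero std::max / std::min pick.
inline bool max_is_negative_zero(double a, double b) {
    return is_negative_zero(std::max(a, b));
}

inline bool min_is_negative_zero(double a, double b) {
    return is_negative_zero(std::min(a, b));
}
```

Calling max_is_negative_zero(-0.0, 0.0) and then max_is_negative_zero(0.0, -0.0) gives different answers, which is the non-commutativity described above.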
There is another important difference. For x != NaN:
std::max(NaN,x) = NaN
std::max(x,NaN) = x
std::min(NaN,x) = NaN
std::min(x,NaN) = x
whereas
fmax(NaN,x) = x
fmax(x,NaN) = x
fmin(NaN,x) = x
fmin(x,NaN) = x
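The table above can be reproduced with a sketch like the following. The wrapper names are invented for the example, and it assumes a platform with IEEE-754 NaN support and no fast-math style flags:

```cpp
#include <algorithm>
#include <cmath>
#include <limits>

inline double quiet_nan() {
    return std::numeric_limits<double>::quiet_NaN();
}

// std::max returns its first argument unless (first < second) is true.
// Every comparison involving NaN is false, so the first argument wins.
inline double std_max_nan_first(double x)  { return std::max(quiet_nan(), x); }
inline double std_max_nan_second(double x) { return std::max(x, quiet_nan()); }

// fmax treats a single NaN as missing data and returns the other argument.
inline double fmax_nan_first(double x)  { return std::fmax(quiet_nan(), x); }
inline double fmax_nan_second(double x) { return std::fmax(x, quiet_nan()); }
```

Only std_max_nan_first propagates the NaN. The other three return x.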
fmax can be emulated with the following code:
#include <algorithm>
#include <cmath>

double myfmax(double x, double y)
{
    // If exactly one argument is NaN, the C standard requires the other one to be returned
    bool xnan = std::isnan(x), ynan = std::isnan(y);
    if (xnan || ynan) {
        if (xnan && !ynan) return y;
        if (!xnan && ynan) return x;
        return x;  // both are NaN
    }
    // +0 > -0 is preferred by the C standard
    if (x == 0 && y == 0) {
        bool xs = std::signbit(x), ys = std::signbit(y);
        if (xs && !ys) return y;
        if (!xs && ys) return x;
        return x;  // both zeros have the same sign
    }
    return std::max(x, y);
}
This shows that std::max is a subset of fmax.
Looking at the assembly shows that Clang uses builtin code for fmax and fmin whereas GCC calls them from a math library. The assembly for Clang for fmax with -O3 is
movapd xmm2, xmm0
cmpunordsd xmm2, xmm2
movapd xmm3, xmm2
andpd xmm3, xmm1
maxsd xmm1, xmm0
andnpd xmm2, xmm1
orpd xmm2, xmm3
movapd xmm0, xmm2
whereas for std::max(double, double) it is simply
maxsd xmm0, xmm1
However, for GCC and Clang using -Ofast, fmax becomes simply
maxsd xmm0, xmm1
So this shows once again that std::max is a subset of fmax, and that when you use a looser floating point model which does not have NaN or signed zero, fmax and std::max are the same. The same argument obviously applies to fmin and std::min.
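To make the fmin side concrete, here is a sketch of the matching emulation. The name myfmin is hypothetical, chosen to mirror myfmax, and the signed-zero branch follows the behaviour the C standard prefers rather than anything every implementation guarantees:

```cpp
#include <algorithm>
#include <cmath>

// Hypothetical counterpart to myfmax: std::min plus explicit handling of
// NaN and signed zero.
inline double myfmin(double x, double y) {
    // If one argument is NaN, return the other. If both are NaN, y (NaN)
    // is returned.
    if (std::isnan(x)) return y;
    if (std::isnan(y)) return x;
    // -0 is treated as smaller than +0.
    if (x == 0 && y == 0) return std::signbit(x) ? x : y;
    return std::min(x, y);
}
```

Without the two special cases the function reduces to a plain std::min, which is what the -Ofast output suggests the compiler does once NaN and signed zero are assumed away.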
Solution 3:
You're missing the entire point of fmin and fmax. They were included in C99 so that modern CPUs could use their native (read: SSE) instructions for floating point min and max and avoid a test and branch (and thus a possibly mispredicted branch). I've rewritten code that used std::min and std::max to use SSE intrinsics for min and max in inner loops instead, and the speed-up was significant.
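A rough sketch of what such an inner loop might look like with SSE2 intrinsics. This is not the code referred to above. The function name elementwise_max is hypothetical, and the block falls back to std::max when SSE2 is not detected through the __SSE2__ macro. Note that _mm_max_pd does not follow fmax's rules for NaN or signed zero, so this is only a drop-in replacement when those inputs cannot occur:

```cpp
#include <algorithm>
#include <cstddef>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

// Hypothetical helper: out[i] = max(a[i], b[i]) for i in [0, n).
inline void elementwise_max(const double* a, const double* b,
                            double* out, std::size_t n) {
    std::size_t i = 0;
#if defined(__SSE2__)
    // Process two doubles per iteration with a branch-free packed max.
    for (; i + 2 <= n; i += 2) {
        __m128d va = _mm_loadu_pd(a + i);  // unaligned loads
        __m128d vb = _mm_loadu_pd(b + i);
        _mm_storeu_pd(out + i, _mm_max_pd(va, vb));
    }
#endif
    // Scalar tail (or the whole array when SSE2 is unavailable).
    for (; i < n; ++i) {
        out[i] = std::max(a[i], b[i]);
    }
}
```

Whether this beats what the compiler generates for a plain std::max loop depends on the compiler and flags, so it is worth measuring before committing to intrinsics.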