What operations and functions on +0.0 and -0.0 give different arithmetic results?
There are a few standard operations and functions that produce numerically different answers between f(+0.0) and f(-0.0).
Different rounding modes or other floating point implementations may give different results.
#include <math.h>
#include <stdio.h>

double inverse(double x) { return 1/x; }
double atan2m1(double y) { return atan2(y, -1.0); }
double sprintf_d(double x) {
  char buf[20];
  // sprintf(buf, "%+f", x); Changed to e
  sprintf(buf, "%+e", x);
  return buf[0]; // returns `+` or `-`
}
double copysign_1(double x) { return copysign(1.0, x); }
double signbit_d(double x) {
  int sign = signbit(x); // my compiler returns 0 or INT_MIN
  return sign;
}
double pow_m1(double x) { return pow(x, -1.0); }

void zero_test(const char *name, double (*f)(double)) {
  double fzp = (f)(+0.0);
  double fzn = (f)(-0.0);
  int differ = fzp != fzn;
  if (fzp != fzp && fzn != fzn) differ = 0; // if both NaN
  printf("%-15s f(+0):%-+15e %s f(-0):%-+15e\n",
         name, fzp, differ ? "!=" : "==", fzn);
}

void zero_tests(void) {
  zero_test("1/x",            inverse);
  zero_test("atan2(x,-1)",    atan2m1);
  zero_test("printf(\"%+e\")",  sprintf_d);
  zero_test("copysign(x,1)",  copysign_1);
  zero_test("signbit()",      signbit_d);
  zero_test("pow(x,-odd)",    pow_m1);  // @Pascal Cuoq
  zero_test("tgamma(x)",      tgamma);  // @vinc17 @Pascal Cuoq
}
Output:
1/x f(+0):+inf != f(-0):-inf
atan2(x,-1) f(+0):+3.141593e+00 != f(-0):-3.141593e+00
printf("%+e") f(+0):+4.300000e+01 != f(-0):+4.500000e+01
copysign(x,1) f(+0):+1.000000e+00 != f(-0):-1.000000e+00
signbit() f(+0):+0.000000e+00 != f(-0):-2.147484e+09
pow(x,-odd) f(+0):+inf != f(-0):-inf
tgamma(x) f(+0):+inf != f(-0):+inf
Notes: tgamma(x) came up == on my gcc 4.8.2 machine, but correctly != on others.
rsqrt(), AKA 1/sqrt(), is a possible future C standard function. It may or may not also distinguish the zeros.
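In the meantime, 1/sqrt() can be tried directly; a minimal sketch, assuming an implementation that follows IEC 60559 (annex F), where sqrt(-0.0) returns -0.0:

#include <math.h>
#include <stdio.h>

int main(void) {
  /* Annex F: sqrt(±0.0) returns ±0.0, so the division restores the sign. */
  printf("1/sqrt(+0.0) = %e\n", 1.0 / sqrt(+0.0)); /* +inf */
  printf("1/sqrt(-0.0) = %e\n", 1.0 / sqrt(-0.0)); /* -inf */
  return 0;
}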
double zero = +0.0; memcmp(&zero, &x, sizeof x) can show that x has a different bit pattern than +0.0, but x could still be a +0.0. I think some FP formats have many bit patterns that are +0.0 and -0.0. TBD.
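A minimal sketch of that bit-pattern test, assuming the common case where the format has exactly one bit pattern per zero:

#include <stdio.h>
#include <string.h>

/* Returns 1 when x compares equal to zero but its bit pattern
   differs from +0.0 - on typical formats, a negative zero. */
int is_negative_zero(double x) {
  double zero = +0.0;
  return x == 0.0 && memcmp(&zero, &x, sizeof x) != 0;
}

int main(void) {
  printf("%d\n", is_negative_zero(+0.0)); /* 0 */
  printf("%d\n", is_negative_zero(-0.0)); /* 1 */
  return 0;
}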
This is a self-answer, as described at https://stackoverflow.com/help/self-answer.
The IEEE 754-2008 function rsqrt (which will be in the future ISO C standard) returns ±∞ on ±0, which is quite surprising. And tgamma also returns ±∞ on ±0. With MPFR, mpfr_digamma returns the opposite of ±∞ on ±0.
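A minimal sketch of that last observation, assuming MPFR ≥ 3.0 (where mpfr_digamma is available) and linking with -lmpfr -lgmp:

#include <stdio.h>
#include <mpfr.h>

int main(void) {
  mpfr_t x, y;
  mpfr_inits2(53, x, y, (mpfr_ptr) 0);
  mpfr_set_d(x, +0.0, MPFR_RNDN);    /* mpfr_set_d preserves the zero's sign */
  mpfr_digamma(y, x, MPFR_RNDN);
  mpfr_printf("digamma(+0) = %Rg\n", y); /* expected: -inf */
  mpfr_set_d(x, -0.0, MPFR_RNDN);
  mpfr_digamma(y, x, MPFR_RNDN);
  mpfr_printf("digamma(-0) = %Rg\n", y); /* expected: +inf */
  mpfr_clears(x, y, (mpfr_ptr) 0);
  return 0;
}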
I have been thinking about the following method, but I can't check it before the weekend, so someone might do some experiments on it if they like, or just tell me that it is nonsense:
- Generate a -0.0f. It should be possible to generate it statically by assigning a tiny negative constant that underflows the float representation.
- Assign this constant to a volatile double and back to float. By changing the bit representation twice, I assume that the compiler-specific standard bit representation for -0.0f is now in the variable. The compiler can't outsmart me there, because a totally different value could be in the volatile variable between those two copies.
- Compare the input to 0.0f, to detect if we have a 0.0f/-0.0f case.
- If it is equal, assign the input to the volatile double variable and then back to float. I again assume that it now has the standard compiler representation for 0.0f.
- Access the bit patterns through a union and compare them, to decide if it is -0.0f.
The code might be something like:
#include <float.h>

typedef union
{
  float fvalue;
  /* assuming int has at least the same number of bits as float */
  unsigned int bitpat;
} tBitAccess;

float my_signf(float x)
{
  /* assuming double has a smaller minimum and a
     different bit representation than float */
  volatile double refitbits;
  tBitAccess tmp;
  unsigned int pat0, patX;
  if (x < 0.0f) return -1.0f;
  if (x > 0.0f) return 1.0f;
  /* -DBL_MIN underflows to a float negative zero */
  refitbits = (double) (float) -DBL_MIN;
  tmp.fvalue = (float) refitbits;
  pat0 = tmp.bitpat;
  refitbits = (double) x;
  tmp.fvalue = (float) refitbits;
  patX = tmp.bitpat;
  return (patX == pat0) ? -1.0f : 1.0f;
}
- It is not a standard function or an operator, but a function that should differentiate between the signs of -0.0 and 0.0.
- It is based (mainly) on the assumption that the compiler vendor does not use different bit patterns for -0.0f as a result of changing formats, even if the floating point format would allow it; if this holds, it is independent of the chosen bit pattern.
- For floating point formats that have exactly one pattern for -0.0f, this function should safely do the trick without knowledge of the bit ordering in that pattern.
- The other assumptions (about the sizes of the types and so on) can be handled with preprocessor switches on the float.h constants.
Edit: On second thought: if we can force the value that compares equal to (0.0 || -0.0) below the smallest representable denormal (subnormal) floating point number or its negative counterpart, and there is no second pattern for -0.0f (exact) in the FP format, we could drop the cast to volatile double. (But maybe keep the float volatile, to ensure that with deactivated denormals the compiler can't do any fancy trick to ignore operations that further reduce the absolute value of things comparing equal to 0.0.)
The code then might look like:
#include <float.h>
#include <math.h>

typedef union
{
  float fvalue;
  /* assuming int has at least the same number of bits as float */
  unsigned int bitpat;
} tBitAccess;

float my_signf(float x)
{
  volatile tBitAccess tmp;
  unsigned int pat0, patX;
  if (x < 0.0f) return -1.0f;
  if (x > 0.0f) return 1.0f;
  tmp.fvalue = -DBL_MIN; /* conversion to float underflows to -0.0f */
  /* forcing something that compares equal to 0.0f below the smallest
     subnormal - not sure if one fabsf() factor is enough */
  tmp.fvalue = tmp.fvalue * fabsf(tmp.fvalue);
  pat0 = tmp.bitpat;
  tmp.fvalue = x;
  tmp.fvalue = tmp.fvalue * fabsf(tmp.fvalue);
  patX = tmp.bitpat;
  return (patX == pat0) ? -1.0f : 1.0f;
}
This might not work with fancy rounding modes that prevent rounding from negative values towards -0.0.
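A minimal usage sketch (a hypothetical driver, assuming one of the my_signf() versions above is in scope):

#include <stdio.h>

int main(void) {
  printf("my_signf(+0.0f) = %+.1f\n", my_signf(+0.0f)); /* expect +1.0 */
  printf("my_signf(-0.0f) = %+.1f\n", my_signf(-0.0f)); /* expect -1.0 */
  printf("my_signf(-2.5f) = %+.1f\n", my_signf(-2.5f)); /* expect -1.0 */
  return 0;
}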
Not exactly an answer to the question, but it can be useful to know:
I just faced the case a - b = c => b = a - c, which fails to hold if a is 0.0 and b is -0.0. We have 0.0 - (-0.0) = 0.0 => b = 0.0 - 0.0 = 0.0. The sign is lost; the -0.0 is not recovered.
Taken from here.
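A minimal demonstration of the lost sign, assuming IEEE 754 doubles with the default rounding mode:

#include <stdio.h>

int main(void) {
  double a = 0.0;
  double b = -0.0;
  double c = a - b;         /* 0.0 - (-0.0) == +0.0 */
  double b2 = a - c;        /* 0.0 - 0.0 == +0.0: the sign of b is lost */
  printf("b  = %+g\n", b);  /* prints -0 */
  printf("b2 = %+g\n", b2); /* prints +0 */
  return 0;
}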