Is it possible to get 0 by subtracting two unequal floating point numbers?
Is it possible to get division by 0 (or infinity) in the following example?
public double calculation(double a, double b)
{
if (a == b)
{
return 0;
}
else
{
return 2 / (a - b);
}
}
In normal cases it will not, of course. But what if a
and b
are very close, can (a-b)
result in being 0
due to precision of the calculation?
Note that this question is for Java, but I think it will apply to most programming languages.
In Java, a - b
is never equal to 0
if a != b
. This is because Java mandates IEEE 754 floating point operations which support denormalized numbers. From the spec:
In particular, the Java programming language requires support of IEEE 754 denormalized floating-point numbers and gradual underflow, which make it easier to prove desirable properties of particular numerical algorithms. Floating-point operations do not "flush to zero" if the calculated result is a denormalized number.
If an FPU works with denormalized numbers, subtracting unequal numbers can never produce zero (unlike multiplication), also see this question.
For other languages, it depends. In C or C++, for example, IEEE 754 support is optional.
That said, it is possible for the expression 2 / (a - b)
to overflow, for example with a = 5e-308
and b = 4e-308
.
As a workaround, what about the following?
public double calculation(double a, double b) {
double c = a - b;
if (c == 0)
{
return 0;
}
else
{
return 2 / c;
}
}
That way you don't depend on IEEE support in any language.
You wouldn't get a division by zero regardless of the value of a - b
, since floating point division by 0 doesn't throw an exception. It returns infinity.
Now, the only way a == b
would return true is if a
and b
contain the exact same bits. If they differ by just the least significant bit, the difference between them will not be 0.
EDIT :
As Bathsheba correctly commented, there are some exceptions:
"Not a number compares" false with itself but will have identical bit patterns.
-0.0 is defined to compare true with +0.0, and their bit patterns are different.
So if both a
and b
are Double.NaN
, you will reach the else clause, but since NaN - NaN
also returns NaN
, you will not be dividing by zero.
There is no case where a division by zero can happen here.
The SMT Solver Z3 supports precise IEEE floating point arithmetic. Let's ask Z3 to find numbers a
and b
such that a != b && (a - b) == 0
:
(set-info :status unknown)
(set-logic QF_FP)
(declare-fun b () (FloatingPoint 8 24))
(declare-fun a () (FloatingPoint 8 24))
(declare-fun rm () RoundingMode)
(assert
(and (not (fp.eq a b)) (fp.eq (fp.sub rm a b) +zero) true))
(check-sat)
The result is UNSAT
. There are no such numbers.
The above SMTLIB string also allows Z3 to pick an arbitrary rounding mode (rm
). This means that the result holds for all possible rounding modes (of which there are five). The result also includes the possibility that any of the variables in play might be NaN
or infinity.
a == b
is implemented as fp.eq
quality so that +0f
and -0f
compare equal. The comparison with zero is implemented using fp.eq
as well. Since the question is aimed at avoiding a division by zero this is the appropriate comparison.
If the equality test was implemented using bitwise equality, +0f
and -0f
would have been a way to make a - b
zero. An incorrect previous version of this answer contains mode details about that case for the curious.
Z3 Online does not yet support the FPA theory. This result was obtained using the latest unstable branch. It can be reproduced using the .NET bindings as follows:
var fpSort = context.MkFPSort32();
var aExpr = (FPExpr)context.MkConst("a", fpSort);
var bExpr = (FPExpr)context.MkConst("b", fpSort);
var rmExpr = (FPRMExpr)context.MkConst("rm", context.MkFPRoundingModeSort());
var fpZero = context.MkFP(0f, fpSort);
var subExpr = context.MkFPSub(rmExpr, aExpr, bExpr);
var constraintExpr = context.MkAnd(
context.MkNot(context.MkFPEq(aExpr, bExpr)),
context.MkFPEq(subExpr, fpZero),
context.MkTrue()
);
var smtlibString = context.BenchmarkToSMTString(null, "QF_FP", null, null, new BoolExpr[0], constraintExpr);
var solver = context.MkSimpleSolver();
solver.Assert(constraintExpr);
var status = solver.Check();
Console.WriteLine(status);
Using Z3 to answer IEEE float questions is nice because it is hard to overlook cases (such as NaN
, -0f
, +-inf
) and you can ask arbitrary questions. No need to interpret and cite specifications. You can even ask mixed float and integer questions such as "is this particular int log2(float)
algorithm correct?".