How should I do floating point comparison?
Solution 1:
Comparing for greater/smaller is not really a problem unless you're working right at the edge of the float/double precision limit.
For a "fuzzy equals" comparison, this (Java code, should be easy to adapt) is what I came up with for The Floating-Point Guide after a lot of work and taking into account lots of criticism:
public static boolean nearlyEqual(float a, float b, float epsilon) {
final float absA = Math.abs(a);
final float absB = Math.abs(b);
final float diff = Math.abs(a - b);
if (a == b) { // shortcut, handles infinities
return true;
} else if (a == 0 || b == 0 || diff < Float.MIN_NORMAL) {
// a or b is zero or both are extremely close to it
// relative error is less meaningful here
return diff < (epsilon * Float.MIN_NORMAL);
} else { // use relative error
return diff / (absA + absB) < epsilon;
}
}
It comes with a test suite. You should immediately dismiss any solution that doesn't, because it is virtually guaranteed to fail in some edge cases like having one value 0, two very small values opposite of zero, or infinities.
An alternative (see link above for more details) is to convert the floats' bit patterns to integer and accept everything within a fixed integer distance.
In any case, there probably isn't any solution that is perfect for all applications. Ideally, you'd develop/adapt your own with a test suite covering your actual use cases.
Solution 2:
TL;DR
- Use the following function instead of the currently accepted solution to avoid some undesirable results in certain limit cases, while being potentially more efficient.
- Know the expected imprecision you have on your numbers and feed them accordingly in the comparison function.
bool nearly_equal(
float a, float b,
float epsilon = 128 * FLT_EPSILON, float abs_th = FLT_MIN)
// those defaults are arbitrary and could be removed
{
assert(std::numeric_limits<float>::epsilon() <= epsilon);
assert(epsilon < 1.f);
if (a == b) return true;
auto diff = std::abs(a-b);
auto norm = std::min((std::abs(a) + std::abs(b)), std::numeric_limits<float>::max());
// or even faster: std::min(std::abs(a + b), std::numeric_limits<float>::max());
// keeping this commented out until I update figures below
return diff < std::max(abs_th, epsilon * norm);
}
Graphics, please?
When comparing floating point numbers, there are two "modes".
The first one is the relative mode, where the difference between x
and y
is considered relatively to their amplitude |x| + |y|
. When plot in 2D, it gives the following profile, where green means equality of x
and y
. (I took an epsilon
of 0.5 for illustration purposes).
The relative mode is what is used for "normal" or "large enough" floating points values. (More on that later).
The second one is an absolute mode, when we simply compare their difference to a fixed number. It gives the following profile (again with an epsilon
of 0.5 and a abs_th
of 1 for illustration).
This absolute mode of comparison is what is used for "tiny" floating point values.
Now the question is, how do we stitch together those two response patterns.
In Michael Borgwardt's answer, the switch is based on the value of diff
, which should be below abs_th
(Float.MIN_NORMAL
in his answer). This switch zone is shown as hatched in the graph below.
Because abs_th * epsilon
is smaller that abs_th
, the green patches do not stick together, which in turn gives the solution a bad property: we can find triplets of numbers such that x < y_1 < y_2
and yet x == y2
but x != y1
.
Take this striking example:
x = 4.9303807e-32
y1 = 4.930381e-32
y2 = 4.9309825e-32
We have x < y1 < y2
, and in fact y2 - x
is more than 2000 times larger than y1 - x
. And yet with the current solution,
nearlyEqual(x, y1, 1e-4) == False
nearlyEqual(x, y2, 1e-4) == True
By contrast, in the solution proposed above, the switch zone is based on the value of |x| + |y|
, which is represented by the hatched square below. It ensures that both zones connects gracefully.
Also, the code above does not have branching, which could be more efficient. Consider that operations such as max
and abs
, which a priori needs branching, often have dedicated assembly instructions. For this reason, I think this approach is superior to another solution that would be to fix Michael's nearlyEqual
by changing the switch from diff < abs_th
to diff < eps * abs_th
, which would then produce essentially the same response pattern.
Where to switch between relative and absolute comparison?
The switch between those modes is made around abs_th
, which is taken as FLT_MIN
in the accepted answer. This choice means that the representation of float32
is what limits the precision of our floating point numbers.
This does not always make sense. For example, if the numbers you compare are the results of a subtraction, perhaps something in the range of FLT_EPSILON
makes more sense. If they are squared roots of subtracted numbers, the numerical imprecision could be even higher.
It is rather obvious when you consider comparing a floating point with 0
. Here, any relative comparison will fail, because |x - 0| / (|x| + 0) = 1
. So the comparison needs to switch to absolute mode when x
is on the order of the imprecision of your computation -- and rarely is it as low as FLT_MIN
.
This is the reason for the introduction of the abs_th
parameter above.
Also, by not multiplying abs_th
with epsilon
, the interpretation of this parameter is simple and correspond to the level of numerical precision that we expect on those numbers.
Mathematical rumbling
(kept here mostly for my own pleasure)
More generally I assume that a well-behaved floating point comparison operator =~
should have some basic properties.
The following are rather obvious:
- self-equality:
a =~ a
- symmetry:
a =~ b
impliesb =~ a
- invariance by opposition:
a =~ b
implies-a =~ -b
(We don't have a =~ b
and b =~ c
implies a =~ c
, =~
is not an equivalence relationship).
I would add the following properties that are more specific to floating point comparisons
- if
a < b < c
, thena =~ c
impliesa =~ b
(closer values should also be equal) - if
a, b, m >= 0
thena =~ b
impliesa + m =~ b + m
(larger values with the same difference should also be equal) - if
0 <= λ < 1
thena =~ b
impliesλa =~ λb
(perhaps less obvious to argument for).
Those properties already give strong constrains on possible near-equality functions. The function proposed above verifies them. Perhaps one or several otherwise obvious properties are missing.
When one think of =~
as a family of equality relationship =~[Ɛ,t]
parameterized by Ɛ
and abs_th
, one could also add
- if
Ɛ1 < Ɛ2
thena =~[Ɛ1,t] b
impliesa =~[Ɛ2,t] b
(equality for a given tolerance implies equality at a higher tolerance) - if
t1 < t2
thena =~[Ɛ,t1] b
impliesa =~[Ɛ,t2] b
(equality for a given imprecision implies equality at a higher imprecision)
The proposed solution also verifies these.