How to avoid floating point errors? [duplicate]

I was trying to write a function to approximate square roots (I know there's the math module...I want to do it myself), and I was getting screwed over by the floating point arithmetic. How can you avoid that?

def sqrt(num):
    root = 0.0
    while root * root < num:
        root += 0.01
    return root

Using this has these results:

>>> sqrt(4)
2.0000000000000013
>>> sqrt(9)
3.00999999999998

I realize I could just use round(), but I want to be able to make this really accurate. I want to be able to calculate out to 6 or 7 digits. That wouldn't be possible if I'm rounding. I want to understand how to properly handle floating point calculations in Python.


This really has nothing to do with Python - you'd see the same behavior in any language using your hardware's binary floating-point arithmetic. First read the docs (the Python tutorial's chapter "Floating Point Arithmetic: Issues and Limitations").

After you read that, you'll better understand that you're not adding one one-hundredth in your code. This is exactly what you're adding:

>>> from decimal import Decimal
>>> Decimal(.01)
Decimal('0.01000000000000000020816681711721685132943093776702880859375')

That string shows the exact decimal value of the binary floating-point ("double precision" in C) approximation to the exact decimal value 0.01. The thing you're really adding is a little bigger than 1/100.
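To see how far off each step is, here is a minimal sketch using the standard `fractions` module to do the comparison exactly. The variable names are only illustrative, and note that each float addition in the loop also rounds, so the total drift is not simply this amount times the number of steps.

```python
from decimal import Decimal
from fractions import Fraction

step = 0.01

# Fraction(float) converts the stored binary value exactly, with no rounding.
exact_step = Fraction(step)
error_per_step = exact_step - Fraction(1, 100)

print(Decimal(step))          # the exact value shown above
print(error_per_step > 0)     # True: the stored value is a little bigger than 1/100
print(float(error_per_step))  # size of the excess, tiny but not zero
```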

Controlling floating-point numeric errors is the field called "numerical analysis", and is a very large and complex topic. So long as you're startled by the fact that floats are just approximations to decimal values, use the decimal module. That will take away a world of "shallow" problems for you. For example, given this small modification to your function:

from decimal import Decimal as D

def sqrt(num):
    root = D(0)
    while root * root < num:
        root += D("0.01")
    return root

then:

>>> sqrt(4)
Decimal('2.00')
>>> sqrt(9)
Decimal('3.00')

It's not really more accurate, but may be less surprising in simple examples because now it's adding exactly one one-hundredth.
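A small sketch of what "exact but not more accurate" means here, assuming the same loop with the step pulled out as a parameter (`sqrt_decimal` and `step` are names made up for this illustration): the additions are exact, but the answer is still only as fine as the step size.

```python
from decimal import Decimal as D

def sqrt_decimal(num, step=D("0.01")):
    """Same search loop as above; every addition of step is exact."""
    root = D(0)
    while root * root < num:
        root += step
    return root

print(sqrt_decimal(4))   # 2.00
print(sqrt_decimal(2))   # 1.42 (overshoots: the true root is about 1.41421)

# One hundred exact hundredths add up to exactly one.
print(sum([D("0.01")] * 100) == D(1))  # True
```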

An alternative is to stick to floats and add something that is exactly representable as a binary float: values of the form I/2**J. For example, instead of adding 0.01, add 0.125 (1/8) or 0.0625 (1/16).
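As a sketch of that alternative (the function name `sqrt_binary_step` and its `step` parameter are just for illustration), the original loop with a step of 1/8 accumulates no rounding error, because every multiple of 0.125 in this range is exactly representable:

```python
def sqrt_binary_step(num, step=0.125):
    """Original search loop, but with a step of the form I/2**J."""
    root = 0.0
    while root * root < num:
        root += step
    return root

# as_integer_ratio() shows the exact fraction a float stores.
print((0.125).as_integer_ratio())  # (1, 8)
print(sqrt_binary_step(4))         # 2.0
print(sqrt_binary_step(9))         # 3.0
```

The resolution is still only 1/8, so this fixes the surprise, not the accuracy: `sqrt_binary_step(2)` returns 1.5.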

Then look up "Newton's method" for computing square roots ;-)
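For reference, a minimal sketch of Newton's method for square roots with plain floats. The name `newton_sqrt`, the starting guess, and the `tolerance` and `max_iterations` defaults are assumptions picked for this example, not anything from the question. Each iteration replaces the guess with the average of the guess and `num / guess`, which converges far faster than stepping by a fixed amount.

```python
def newton_sqrt(num, tolerance=1e-12, max_iterations=100):
    """Approximate the square root of num by Newton's iteration."""
    if num < 0:
        raise ValueError("cannot take the square root of a negative number")
    if num == 0:
        return 0.0
    guess = num if num >= 1 else 1.0  # any positive starting guess works
    for _ in range(max_iterations):
        better = 0.5 * (guess + num / guess)
        # Stop once the relative change between iterations is negligible.
        if abs(better - guess) <= tolerance * better:
            return better
        guess = better
    return guess

print(abs(newton_sqrt(9) - 3.0) < 1e-9)       # True
print(abs(newton_sqrt(2) ** 2 - 2.0) < 1e-9)  # True
```

That easily covers the 6 or 7 digits asked for, in a handful of iterations.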


I mean, there are modules such as decimal and fractions. But I made a class for problems like these. This class only handles addition, subtraction, multiplication, floor division, division, and modulus, but it is easily extendable. It basically converts each float into a list ([the float, the power of ten to multiply the float by to get an integer]) and does arithmetic from there. Integer arithmetic is exact in Python, unlike float arithmetic, and that's what this class takes advantage of. So, without further ado, here's the code:

class decimal():
    # TODO: # OPTIMISE: code to maximize performance
    """
    Class decimal, a more reliable alternative to float. | v0.1
    ============================================================
    Python's floats (and those in many other languages as well) are
    pretty inaccurate. On the outside, a sum may look like this:

    .1 + .1 + .1

    But on the inside, it gets converted to base 2. Python asks
    the computer, "What is 0.1 in base 2?". The
    computer says, "Oh, it has no exact form; would an approximation
    be sufficient?"
    Python be like, "Oh, sure, why not? It's not like we need to
    give it that much accuracy."
    And so that happens. But what floats ARE good at is
    everything else, including multiplying a float by a power
    of 10. So I abused that and made this: the decimal
    class. We humans know that 1 + 1 + 1 = 3. Well, most of us
    anyway, but that's not important. The thing is, computers can
    too! This new replacement does the following:

    1. Find the power of ten, 10 ^ n, that turns the number inputted
       into a valid integer.
    2. Make a list with the original float and 10 ^ n (multiplying by
       10^-n is inaccurate).

    And that's pretty much it, if you don't count the
    adding, subtracting, etc. algorithms. This is more accurate than just
    ".1 + .1 + .1", and it's simpler than hand-typing
    (.1 * 100 + .1 * 100 + .1 * 100)/100
    (which is basically the algorithm for this). But it does have its costs.
    --------------------------------------------------------------------------

    BAD #1: It's slightly slower than the conventional .1 + .1 + .1, but
            it DOES make up for that in accuracy.

    BAD #2: It's useless; there are many libraries out there that solve the
            same problem as this. They may be more or less efficient than this
            method, thus rendering this useless.
    --------------------------------------------------------------------------
    And that's pretty much it! Thanks for stopping by to read this doc-string.
    --------------------------------------------------------------------------
        Copyright © 2020 Bryan Hu

        Permission is hereby granted, free of charge, to any person obtaining
        a copy of this software and associated documentation files
        (the "Software"), to deal in the Software without restriction,
        including without limitation the rights to use, copy, modify,
        merge, publish, distribute, sub-license, and/or sell copies of
        the Software, and to permit persons to whom the Software is
        furnished to do so, subject to the following conditions:

        The above copyright notice and this permission notice shall be included
        in all copies or substantial portions of the Software.

        THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS
        OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
        MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
        IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY
        CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
        TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
        SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
    """

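    def __init__(self, number):
        super(decimal, self).__init__()
        # Accept a [value, scale] pair as well as a plain number.
        if isinstance(number, (list, tuple)):
            processed = float(number[0])
        else:
            processed = float(number)
        # Find the smallest power of ten that scales the value to an integer.
        x = 10
        while round(processed * x) != processed * x:
            x *= 10
        self.number = [processed, x]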

    def __add__(self, other):
        try:
            # Both operands must unpack to [value, scale].
            the_other_number, num = list(other), list(self.number)
            maximum = max(
                float(num[1]), float(the_other_number[1]))
        except (TypeError, IndexError):
            raise TypeError(
                "Entered {}, which has the type {}, is not a valid type".format(
                    other, type(other)))
        return decimal(
            (num[0] * maximum + the_other_number[0] * maximum) / maximum)

    def __float__(self):
        return float(self.number[0])

    def __bool__(self):
        return bool(self.number[0])

    def __str__(self):
        return str(self.number)

    def __iter__(self):
        return (x for x in self.number)

    def __repr__(self):
        return str(self.number[0])

    def __sub__(self, other):
        try:
            # Both operands must unpack to [value, scale].
            the_other_number, num = list(other), list(self.number)
            maximum = max(
                float(num[1]), float(the_other_number[1]))
        except (TypeError, IndexError):
            raise TypeError(
                "Entered {}, which has the type {}, is not a valid type".format(
                    other, type(other)))
        return decimal(
            (num[0] * maximum - the_other_number[0] * maximum) / maximum)

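    def __truediv__(self, other):
        try:
            # Both operands must unpack to [value, scale].
            the_other_number, num = list(other), list(self.number)
            maximum = max(
                float(num[1]), float(the_other_number[1]))
        except (TypeError, IndexError):
            raise TypeError(
                "Entered {}, which has the type {}, is not a valid type".format(
                    other, type(other)))
        # The common scale cancels in a quotient, so no rescaling is needed.
        return decimal(
            (num[0] * maximum) / (the_other_number[0] * maximum))

    __div__ = __truediv__  # Python 2 spelling of the same operator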

    def __floordiv__(self, other):
        try:
            # Both operands must unpack to [value, scale].
            the_other_number, num = list(other), list(self.number)
            maximum = max(
                float(num[1]), float(the_other_number[1]))
        except (TypeError, IndexError):
            raise TypeError(
                "Entered {}, which has the type {}, is not a valid type".format(
                    other, type(other)))
        # The common scale cancels in a quotient, so no rescaling is needed.
        return decimal(
            (num[0] * maximum) // (the_other_number[0] * maximum))

    def __mul__(self, other):
        try:
            # Both operands must unpack to [value, scale].
            the_other_number, num = list(other), list(self.number)
            maximum = max(
                float(num[1]), float(the_other_number[1]))
        except (TypeError, IndexError):
            raise TypeError(
                "Entered {}, which has the type {}, is not a valid type".format(
                    other, type(other)))
        # Each factor carries one copy of the scale, so divide it out twice.
        return decimal(
            ((num[0] * maximum) * (
                the_other_number[0] * maximum)) / (maximum * maximum))

    def __mod__(self, other):
        try:
            # Both operands must unpack to [value, scale].
            the_other_number, num = list(other), list(self.number)
            maximum = max(
                float(num[1]), float(the_other_number[1]))
        except (TypeError, IndexError):
            raise TypeError(
                "Entered {}, which has the type {}, is not a valid type".format(
                    other, type(other)))
        return decimal(
            ((num[0] * maximum) % (
                the_other_number[0] * maximum)) / maximum)
    # Pastebin: https://pastebin.com/MwzZ1W9e
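
The scaling idea on its own can be sketched in a few lines, separately from the class. This is only an illustration: `add_scaled` and its `places` parameter are hypothetical names, and the caller has to know how many decimal places the inputs have. The `fractions` module mentioned at the top gets the same result without any scaling bookkeeping.

```python
from fractions import Fraction

def add_scaled(a, b, places):
    """Add two floats via integers scaled by 10 ** places (hypothetical helper)."""
    scale = 10 ** places
    # round() snaps each scaled float to the integer it was meant to be.
    total = round(a * scale) + round(b * scale)
    return total / scale

print(0.1 + 0.2 == 0.3)                # False
print(add_scaled(0.1, 0.2, 1) == 0.3)  # True

# Fractions built from strings are exact from the start.
exact_sum = Fraction("0.1") + Fraction("0.2")
print(exact_sum == Fraction(3, 10))    # True
```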