Norms in Python for floating point vs. Decimal (fixed-point)

Is it recommended to use Python's native floating point implementation, or its decimal implementation for use-cases where precision is important?

I thought this question would be easy to answer: if accumulated error has significant implications, e.g. perhaps in calculating orbital trajectories or the like, then an exact representation might make more sense.

I'm unsure what the norms are for run-of-the-mill deep learning use-cases, for scientific computing generally (e.g. many people use numpy or scikit-learn, which I think use floating-point implementations), and for financial computing (e.g. trading strategies).

Does anyone know the norms for floating point vs. Decimal use in Python for these three areas?

  1. Finance (Trading Strategies)
  2. Deep Learning
  3. Scientific Computing

Thanks

N.B.: This is /not/ a question about the difference between floating-point and fixed-point representations, or about why floating-point arithmetic produces surprising results. This is a question about what the norms are.


I work more with Deep Learning and Scientific Computing, but since my family runs a financial business, I think I can answer the question.

First and foremost, floating-point numbers are not evil; all you need to do is understand how much precision your project needs.
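As a quick, illustrative check (nothing project-specific here), Python itself will tell you how much precision an ordinary float, which is an IEEE 754 double, actually provides before you decide whether you need Decimal at all:

```python
import sys

# Python's built-in float is an IEEE 754 double.
print(sys.float_info.dig)      # 15 -> guaranteed significant decimal digits
print(sys.float_info.epsilon)  # ~2.22e-16 -> spacing between 1.0 and the next float
```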

Finance

In finance, depending on the usage, you can use either decimal or floating point. In addition, different banks have different requirements. Generally, if you are dealing with cash or cash equivalents, you may use decimal, since the fractional monetary unit is known. For example, for dollars the fractional monetary unit is 0.01, so you can use decimal to store it, and in the database you can use NUMBER(20,2) (Oracle) or something similar to store your decimal numbers. That precision is enough because banks had systematic ways of minimizing errors long before computers appeared; the programmers only need to correctly implement what the bank's guidelines say.
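A minimal sketch with Python's decimal module shows the idea; the prices and the rounding mode are just assumptions, real systems follow the bank's own rounding rules:

```python
from decimal import Decimal, ROUND_HALF_UP

CENT = Decimal("0.01")   # fractional monetary unit for dollars

price = Decimal("19.99")                     # construct from strings, never from floats
subtotal = (price * 3).quantize(CENT, rounding=ROUND_HALF_UP)
print(subtotal)                              # 59.97, exact

print(19.99 * 3)                             # 59.970000000000006 with binary floats
```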

For other things in finance, such as analysis and interest rates, using double is enough. Here precision is not important, but simplicity matters. CPUs are optimized for floating-point arithmetic, so no special methods are needed. Since computer arithmetic is a huge topic, using an optimized and well-tested way of performing a calculation is much safer than inventing your own arithmetic routines. Moreover, one or two float operations will not have a huge impact on precision. For example, banks usually store the value as a decimal, perform the multiplication with a float interest rate, and then convert back to decimal; this way, errors do not accumulate. Considering we only need two digits to the right of the decimal point, a float's precision is quite enough for such a computation.
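A sketch of that round trip might look like the following; the balance, rate, and rounding mode are made-up values, not anyone's actual guideline:

```python
from decimal import Decimal, ROUND_HALF_UP

CENT = Decimal("0.01")

balance = Decimal("1523.37")   # stored exactly, e.g. NUMBER(20,2) in the database
annual_rate = 0.0235           # a plain float is fine for the rate

# One float multiplication, then immediately back to an exact decimal.
interest = Decimal(float(balance) * annual_rate).quantize(CENT, rounding=ROUND_HALF_UP)
new_balance = balance + interest

print(interest, new_balance)   # 35.80 1559.17
```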

I have heard that investment banks use double throughout their systems, since they deal with very large amounts of cash; in these banks, simplicity and performance matter more than precision.

Deep Learning

Deep Learning is one of the fields that needs high performance but not high precision. A neural network can have millions of parameters, so the precision of a single weight or bias barely affects the network's predictions. Instead, the network needs to compute very fast in order to train on a given dataset and produce predictions in a reasonable amount of time. Moreover, many accelerators are built to speed up one specific floating-point type: half precision, i.e. fp16. Thus, to reduce the size of the network in memory and to accelerate training and inference, many neural networks run in mixed precision. The framework and the accelerator driver decide which parameters can be computed in fp16 with minimal risk of overflow and underflow, since fp16 has a rather small range (roughly 6 × 10^-8 up to 65504); the remaining parameters are still computed in fp32. In some edge deployments the usable memory is very small (for example, the Kendryte K210 and the Edge TPU only have about 8 MB of on-board SRAM), so neural networks are quantized to 8-bit fixed-point numbers to fit on these devices. Fixed-point numbers resemble decimals in that, unlike floating-point numbers, they have a fixed number of digits after the radix point; in practice they are represented as int8 or uint8.
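A small NumPy sketch illustrates both points, the narrow range of fp16 and the memory savings over fp32 (the array size here is arbitrary):

```python
import numpy as np

# Range and precision of half vs. single precision.
print(np.finfo(np.float16))   # max is 65504; smallest positive normal ~6.1e-05
print(np.finfo(np.float32))   # max ~3.4e+38

# The same million "weights" take half the memory in fp16.
weights_fp32 = np.random.randn(1_000_000).astype(np.float32)
weights_fp16 = weights_fp32.astype(np.float16)
print(weights_fp32.nbytes, weights_fp16.nbytes)   # 4000000 2000000

# Values beyond the fp16 range overflow to infinity.
print(np.float16(70000.0))    # inf
```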

Scientific Computing

The double type (i.e. a 64-bit floating-point number) usually meets scientists' needs in scientific computing. In addition, IEEE 754 defines a quad-precision format (128-bit) to facilitate scientific computation, and Intel's x86 processors also offer an 80-bit extended-precision format.
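You can inspect what your platform actually gives you; as a rough sketch, note that np.longdouble maps to the x86 80-bit extended format on most Linux/x86 builds and to plain double on Windows, while true IEEE quad (binary128) support is rarer:

```python
import numpy as np

# Compare the standard double with whatever extended type this platform exposes.
print(np.finfo(np.float64))
print(np.finfo(np.longdouble))
```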

However, some scientific computations need arbitrary-precision arithmetic. For example, computing pi or running astronomical simulations requires very high precision. For that, something different is needed: arbitrary-precision floating-point numbers. One of the most famous libraries supporting them is the GNU Multiple Precision Arithmetic Library (GMP). Such libraries store each number as an array of machine words in memory and implement long arithmetic in software to compute the final result.
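In Python, one lightweight way to get user-chosen precision is the mpmath library (gmpy2 wraps GMP itself and is faster, but mpmath keeps this sketch dependency-light); the 50-digit setting below is arbitrary:

```python
from mpmath import mp

mp.dps = 50                      # work with 50 significant decimal digits
print(mp.pi)                     # pi to 50 digits
print(mp.sqrt(2) ** 2 - 2)       # residual error ~1e-50 rather than ~1e-16
```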

In general, standard floating-point numbers are designed quite well and elegantly. As long as you understand your needs, floating-point numbers are adequate for most uses.