Safest way to convert float to integer in Python?

Solution 1:

All integers that can be represented by floating point numbers have an exact representation. So you can safely apply int to the result of floor. Inexact representations occur only if you are trying to represent a rational number with a denominator that is not a power of two, or a number that needs more significant bits than the format holds.

That this works is not trivial at all! It's a property of the IEEE floating point representation that int∘floor = ⌊⋅⌋ if the magnitude of the numbers in question is small enough, but different representations are possible where int(floor(2.3)) might be 1.

To quote from Wikipedia,

Any integer with absolute value less than or equal to 2²⁴ can be exactly represented in the single precision format, and any integer with absolute value less than or equal to 2⁵³ can be exactly represented in the double precision format.
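The limit in that quote can be checked directly. The following is a minimal sketch, assuming Python's float is an IEEE-754 double (true on essentially every platform, but not guaranteed by the language); the name LIMIT is only illustrative.

```python
import sys

# Assumption: Python floats are IEEE-754 doubles with a 53-bit significand.
assert sys.float_info.mant_dig == 53

LIMIT = 2 ** 53

# Every integer up to the limit survives a round trip through float.
for n in (0, 1, 2 ** 31, LIMIT - 1, LIMIT):
    assert int(float(n)) == n
    assert int(float(-n)) == -n

# The first integer past the limit does not: it rounds to a neighbour.
print(int(float(LIMIT + 1)))  # 9007199254740992, i.e. 2**53
```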

Solution 2:

Using int(your_non_integer_number) will do it; int truncates toward zero.

import math

print(int(2.3))  # 2
print(int(math.sqrt(5)))  # 2
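One thing worth seeing side by side is how int treats negative numbers, since it truncates toward zero rather than rounding down. A small sketch comparing it with math.floor and math.trunc from the standard library:

```python
import math

for x in (2.7, -2.7):
    # int and math.trunc drop the fractional part; math.floor rounds down.
    print(x, int(x), math.floor(x), math.trunc(x))
# 2.7 2 2 2
# -2.7 -2 -3 -2
```

If you need "round down" semantics for negative inputs, math.floor is the one to use.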

Solution 3:

You could use the round function. If you omit the second parameter (the number of decimal digits to round to), the value is rounded to the nearest integer, which may be the behavior you want.

IDLE output (Python 3, where round returns an int and exact halves round to the nearest even integer).

>>> round(2.99999999999)
3
>>> round(2.6)
3
>>> round(2.5)
2
>>> round(2.4)
2

Solution 4:

Combining two of the previous results, we have:

int(round(some_float))

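This converts a float to the nearest integer. In Python 3, round with a single argument already returns an int, so the outer int is redundant there; in Python 2, round returns a float and the int call is needed.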
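If "safe" also means failing clearly on values that have no integer equivalent, a small wrapper can help. This is only a sketch: to_int is a hypothetical helper name, not a standard function, and raising ValueError for both NaN and infinities is a design choice made here for illustration.

```python
import math


def to_int(x):
    """Round a float to the nearest int, rejecting NaN and infinities.

    to_int is a hypothetical helper, not part of the standard library.
    """
    if math.isnan(x) or math.isinf(x):
        raise ValueError(f"cannot convert {x!r} to an integer")
    return int(round(x))


print(to_int(2.6))   # 3
print(to_int(-2.6))  # -3
print(to_int(1e20))  # 100000000000000000000
```

Without the check, int(float("nan")) raises ValueError and int(float("inf")) raises OverflowError, so the wrapper mainly makes the failure mode uniform.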

Solution 5:

Solution 1 points out that int∘floor = ⌊⋅⌋ holds for IEEE floating point numbers only if the magnitude of the numbers in question is small enough. This post explains why it works in that range.

A double can represent every 32-bit integer exactly, so no rounding can occur. More precisely, doubles can represent all integers from -2⁵³ to 2⁵³ inclusive.

Short explanation: A double can store up to 53 binary digits. When you require more, the number is padded with zeroes on the right.

It follows that 53 ones is the largest number that can be stored without padding. Naturally, all integers requiring fewer digits can be stored exactly.

Adding one to 111(omitted)111 (53 ones) yields 100...000 (a 1 followed by 53 zeroes). Since we can store only 53 digits, the rightmost zero is padding.

This is where 2⁵³ comes from.

More detail: We need to consider how IEEE-754 floating point works.

  1 bit    11 / 8     52 / 23      # bits double/single precision
[ sign |  exponent | mantissa ]

The number is then calculated as follows (excluding special cases that are irrelevant here):

(-1)^sign × 1.mantissa × 2^{exponent - bias}

where bias = 2^{k - 1} - 1 and k is the number of exponent bits, i.e. 1023 and 127 for double/single precision respectively.
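To make the layout concrete, here is a sketch that pulls the three fields out of a Python float with the standard struct module and rebuilds the value from them. The names decode_double and BIAS are illustrative only, and the reconstruction covers normal numbers, not zero, subnormals, infinities or NaN.

```python
import struct


def decode_double(x):
    """Split a Python float into its IEEE-754 double fields."""
    # Reinterpret the 8 bytes of the double as an unsigned 64-bit integer.
    (bits,) = struct.unpack(">Q", struct.pack(">d", x))
    sign = bits >> 63
    exponent = (bits >> 52) & 0x7FF    # 11 exponent bits
    mantissa = bits & ((1 << 52) - 1)  # 52 stored mantissa bits
    return sign, exponent, mantissa


BIAS = 2 ** (11 - 1) - 1  # 1023 for doubles

sign, exponent, mantissa = decode_double(6.0)
# 6 = 1.1 (binary) x 2^2, so the stored exponent is 2 + 1023.
print(sign, exponent - BIAS, bin(mantissa >> 50))  # 0 2 0b10

# Rebuild the value from the fields (normal numbers only).
value = (-1) ** sign * (1 + mantissa / 2 ** 52) * 2.0 ** (exponent - BIAS)
print(value)  # 6.0
```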

Knowing that multiplying by 2^X simply shifts all bits X places to the left, it's easy to see that for any integer, all mantissa bits that end up to the right of the binary point must be zero.

Any integer except zero has the following form in binary:

1x...x where the x-es represent the bits to the right of the MSB (most significant bit).

Because we excluded zero, there will always be an MSB that is one, which is why it's not stored. To store the integer, we must bring it into the aforementioned form: (-1)^sign × 1.mantissa × 2^{exponent - bias}.

That amounts to shifting the bits over the binary point until only the MSB remains to its left. All the bits to the right of the binary point are then stored in the mantissa.

From this, we can see that we can store at most 52 binary digits apart from the MSB.

It follows that the highest number where all bits are explicitly stored is

111(omitted)111.   that's 53 ones (52 + implicit 1) in the case of doubles.

For this, we need to set the exponent such that the binary point is shifted 52 places. If we were to increase the exponent by one, we could no longer store the digit immediately to the left of the binary point.

111(omitted)111x.

By convention, it's 0. Setting the entire mantissa to zero, we receive the following number:

100(omitted)00x. = 100(omitted)000.

That's a 1 followed by 53 zeroes, 52 stored and 1 added due to the exponent.

It represents 2⁵³, which marks the boundary (both negative and positive) between which we can accurately represent all integers. If we wanted to add one to 2⁵³, we would have to set the implicit zero (denoted by the x) to one, but that's impossible.
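The boundary is easy to observe in an interpreter. A brief sketch, again assuming Python floats are IEEE-754 doubles with the default round-to-nearest-even mode (LIMIT is just an illustrative name):

```python
LIMIT = 2.0 ** 53

# Adding one has no effect: the exact sum is not representable
# and rounds back to 2**53.
print(LIMIT + 1 == LIMIT)        # True
# Adding two lands on the next representable double.
print(int(LIMIT + 2) - 2 ** 53)  # 2
# Just below the limit, consecutive integers are still distinct.
print((LIMIT - 1) + 1 == LIMIT)  # True
```

Above 2⁵³ the gap between neighbouring doubles is 2, so only even integers are representable until the gap doubles again at 2⁵⁴.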