What's the mathematical reason behind Python choosing to round integer division toward negative infinity?

I know Python // rounds towards negative infinity and in C++ / is truncating, rounding towards 0.

And here's what I know so far:

               |remainder|
-12 / 10  = -1,   - 2      // C++
-12 // 10 = -2,   + 8      # Python

12 / -10  = -1,     2      // C++
12 // -10 = -2,   - 8      # Python

12 / 10  = 1,      2       // Both
12 // 10 = 1,      2

-12 / -10 = 1,    - 2      // Both
          = 2,    + 8

C++:
1. m%(-n) == m%n
2. -m%n == -(m%n)
3. (m/n)*n + m%n == m

Python:
1. m%(-n) == -8 == -(-m%n)
2. (m//n)*n + m%n == m

But why Python // choose to round towards negative infinity? I didn't find any resources explain that, but only find and hear people say vaguely: "for mathematics reasons".

For example, in Why is -1/2 evaluated to 0 in C++, but -1 in Python?:

People dealing with these things in the abstract tend to feel that rounding toward negative infinity makes more sense (that means it's compatible with the modulo function as defined in mathematics, rather than % having a somewhat funny meaning).

But I don't see C++ 's / not being compatible with the modulo function. In C++, (m/n)*n + m%n == m also applies.

So what's the (mathematical) reason behind Python choosing rounding towards negative infinity?


See also Guido van Rossum's old blog post on the topic.


Solution 1:

But why Python // choose to round towards negative infinity?

I'm not sure if the reason why this choice was originally made is documented anywhere (although, for all I know, it could be explained in great length in some PEP somewhere), but we can certainly come up with various reasons why it makes sense.

One reason is simply that rounding towards negative (or positive!) infinity means that all numbers get rounded the same way, whereas rounding towards zero makes zero special. The mathematical way of saying this is that rounding down towards −∞ is translation invariant, i.e. it satisfies the equation:

round_down(x + k) == round_down(x) + k

for all real numbers x and all integers k. Rounding towards zero does not, since, for example:

round_to_zero(0.5 - 1) != round_to_zero(0.5) - 1

Of course, other arguments exist too, such as the argument you quote based on compatibility with (how we would like) the % operator (to behave) — more on that below.

Indeed, I would say the real question here is why Python's int() function is not defined to round floating point arguments towards negative infinity, so that m // n would equal int(m / n). (I suspect "historical reasons".) Then again, it's not that big of a deal, since Python does at least have math.floor() that does satisfy m // n == math.floor(m / n).


But I don't see C++ 's / not being compatible with the modulo function. In C++, (m/n)*n + m%n == m also applies.

True, but retaining that identity while having / round towards zero requires defining % in an awkward way for negative numbers. In particular, we lose both of the following useful mathematical properties of Python's %:

  1. 0 <= m % n < n for all m and all positive n; and
  2. (m + k * n) % n == m % n for all integers m, n and k.

These properties are useful because one of the main uses of % is "wrapping around" a number m to a limited range of length n.


For example, let's say we're trying to calculate directions: let's say heading is our current compass heading in degrees (counted clockwise from due north, with 0 <= heading < 360) and that we want to calculate our new heading after turning angle degrees (where angle > 0 if we turn clockwise, or angle < 0 if we turn counterclockwise). Using Python's % operator, we can calculate our new heading simply as:

heading = (heading + angle) % 360

and this will simply work in all cases.

However, if we try to to use this formula in C++, with its different rounding rules and correspondingly different % operator, we'll find that the wrap-around doesn't always work as expected! For example, if we start facing northwest (heading = 315) and turn 90° clockwise (angle = 90), we'll indeed end up facing northeast (heading = 45). But if then try to turn back 90° counterclockwise (angle = -90), with C++'s % operator we won't end up back at heading = 315 as expected, but instead at heading = -45!

To get the correct wrap-around behavior using the C++ % operator, we'll instead need to write the formula as something like:

heading = (heading + angle) % 360;
if (heading < 0) heading += 360;

or as:

heading = ((heading + angle) % 360) + 360) % 360;

(The simpler formula heading = (heading + angle + 360) % 360 will only work if we can always guarantee that heading + angle >= -360.)

This is the price you pay for having a non-translation-invariant rounding rule for division, and consequently a non-translation-invariant % operator.

Solution 2:

But why Python // choose to round towards negative infinity?

According to python-history.blogspot.com Guido van Rossum elected such behavior for // because

(...)there is a good mathematical reason. The integer division operation (//) and its sibling, the modulo operation (%), go together and satisfy a nice mathematical relationship (all variables are integers):

a/b = q with remainder r

such that

b*q + r = a and 0 <= r < b

(assuming a and b are >= 0).

If you want the relationship to extend for negative a (keeping b positive), you have two choices: if you truncate q towards zero, r will become negative, so that the invariant changes to 0 <= abs(r) < otherwise, you can floor q towards negative infinity, and the invariant remains 0 <= r < b(...) In mathematical number theory, mathematicians always prefer the latter choice (...). For Python, I made the same choice because there are some interesting applications of the modulo operation where the sign of a is uninteresting. Consider taking a POSIX timestamp (seconds since the start of 1970) and turning it into the time of day. Since there are 24*3600 = 86400 seconds in a day, this calculation is simply t % 86400. But if we were to express times before 1970 using negative numbers, the "truncate towards zero" rule would give a meaningless result! Using the floor rule it all works out fine. Other applications I've thought of are computations of pixel positions in computer graphics. I'm sure there are more.

So summing it up // behavior choice is due to keeping it consistent with % behavior, latter was selected due to its usefulness in working with negative (before start of 1970) timestamps and pixels.

Solution 3:

Both whole-number and real-number arithmetic define their division operators so that both of the following equivalences hold for all values of n and d.

(n+d)/d = (n/d)+1
(-n)/d = -(n/d)

Unfortunately, integer arithmetic cannot be defined in such a way that both hold. For many purposes, the first equivalence is more useful than the second, but in most situations where code would be dividing two values, one of the following would apply:

  1. Both values are positive, in which case the second equivalence is irrelevant.

  2. The dividend is a precise integer multiple of the divisor, in which case both equivalences can hold simultaneously.

Historically, the easiest way to handle division involving negative numbers was to observe whether exactly one operand was negative, drop the signs, perform the division, and then make the result negative if exactly one operand was. This would behave as required in both of the common situations, and would be cheaper than using an approach that would uphold the first equivalence in all cases, rather than only when the divisor was an exact multiple of the dividend.

Python shouldn't be viewed as using inferior semantics, but rather deciding that semantics which would generally be superior in cases that mattered would be worth making division slightly slower even in cases where the precise semantics wouldn't matter.