How do numpy's in-place operations (e.g. `+=`) work?

Solution 1:

The first thing you need to realise is that a += x doesn't map exactly to a.__iadd__(x), instead it maps to a = a.__iadd__(x). Notice that the documentation specifically says that in-place operators return their result, and this doesn't have to be self (although in practice, it usually is). This means a[i] += x trivially maps to:

a.__setitem__(i, a.__getitem__(i).__iadd__(x))

So, the addition technically happens in-place, but only on a temporary object. There is still potentially one less temporary object created than if it called __add__, though.

Solution 2:

Actually that has nothing to do with numpy. There is no "set/getitem in-place" in python, these things are equivalent to a[indices] = a[indices] + x. Knowing that, it becomes pretty obvious what is going on. (EDIT: As lvc writes, actually the right hand side is in place, so that it is a[indices] = (a[indices] += x) if that was legal syntax, that has largly the same effect though)

Of course a += x actually is in-place, by mapping a to the np.add out argument.

It has been discussed before and numpy cannot do anything about it as such. Though there is an idea to have a np.add.at(array, index_expression, x) to at least allow such operations.

Solution 3:

I don't know what's going on under the hood, but in-place operations on items in NumPy arrays and in Python lists will return the same reference, which IMO can lead to confusing results when passed into a function.

Start with Python

>>> a = [1, 2, 3]
>>> b = a
>>> a is b
True
>>> id(a[2])
12345
>>> id(b[2])
12345

... where 12345 is a unique id for the location of the value at a[2] in memory, which is the same as b[2].

So a and b refer to the same list in memory. Now try in-place addition on an item in the list.

>>> a[2] += 4
>>> a
[1, 2, 7]
>>> b
[1, 2, 7]
>>> a is b
True
>>> id(a[2])
67890
>>> id(b[2])
67890

So in-place addition of the item in the list only changed the value of the item at index 2, but a and b still reference the same list, although the 3rd item in the list was reassigned to a new value, 7. The reassignment explains why if a = 4 and b = a were integers (or floats) instead of lists, then a += 1 would cause a to be reassigned, and then b and a would be different references. However, if list addition is called, eg: a += [5] for a and b referencing the same list, it does not reassign a; they will both be appended.

Now for NumPy

>>> import numpy as np
>>> a = np.array([1, 2, 3], float)
>>> b = a
>>> a is b
True

Again these are the same reference, and in-place operators seem have the same effect as for list in Python:

>>> a += 4
>>> a
array([ 5.,  6.,  7.])
>>> b
array([ 5.,  6.,  7.])

In place addition of an ndarray updates the reference. This is not the same as calling numpy.add which creates a copy in a new reference.

>>> a = a + 4
>>> a
array([  9.,  10.,  11.])
>>> b
array([ 5.,  6.,  7.])

In-place operations on borrowed references

I think the danger here is if the reference is passed to a different scope.

>>> def f(x):
...     x += 4
...     return x

The argument reference to x is passed into the scope of f which does not make a copy and in fact changes the value at that reference and passes it back.

>>> f(a)
array([ 13.,  14.,  15.])
>>> f(a)
array([ 17.,  18.,  19.])
>>> f(a)
array([ 21.,  22.,  23.])
>>> f(a)
array([ 25.,  26.,  27.])

The same would be true for a Python list as well:

>>> def f(x, y):
...     x += [y]

>>> a = [1, 2, 3]
>>> b = a
>>> f(a, 5)
>>> a
[1, 2, 3, 5]
>>> b
[1, 2, 3, 5]

IMO this can be confusing and sometimes difficult to debug, so I try to only use in-place operators on references that belong to the current scope, and I try be careful of borrowed references.

Solution 4:

As Ivc explains, there is no in-place item add method, so under the hood it uses __getitem__, then __iadd__, then __setitem__. Here's a way to empirically observe that behavior:

import numpy

class A(numpy.ndarray):
    def __getitem__(self, *args, **kwargs):
        print("getitem")
        return numpy.ndarray.__getitem__(self, *args, **kwargs)
    def __setitem__(self, *args, **kwargs):
        print("setitem")
        return numpy.ndarray.__setitem__(self, *args, **kwargs)
    def __iadd__(self, *args, **kwargs):
        print("iadd")
        return numpy.ndarray.__iadd__(self, *args, **kwargs)

a = A([1,2,3])
print("about to increment a[0]")
a[0] += 1

It prints

about to increment a[0]
getitem
iadd
setitem