Why does Python/NumPy's += mutate the original array?

import numpy as np

W = np.array([0,1,2])
W1 = W
W1 += np.array([2,3,4])
print(W)

W = np.array([0,1,2])
W1 = W
W1 = W1 + np.array([2,3,4])
print(W)

The first snippet mutates W, but the second does not. Why?


Solution 1:

This is true for almost any mutable collection, and it follows from how Python binds names to objects. For mutable collections, var1 += var2 is not the same as var1 = var1 + var2. I'll explain it as far as I understand it; edits and criticism are welcome.

print("1:")
x1 = [7]
y1 = x1
y1 += [3]
print("{} {}".format(x1, id(x1)))
print("{} {}".format(y1, id(y1)))

print("2:")
x2 = [7]
y2 = x2
y2 = y2 + [3]
print("{} {}".format(x2, id(x2)))
print("{} {}".format(y2, id(y2)))

Output:

1:
[7, 3] 40229784 # first id
[7, 3] 40229784 # same id
2:
[7]    40228744 # first id
[7, 3] 40230144 # new id

Writing var1 = var1 + var2 creates a new object with a new ID: Python evaluates the right-hand side, builds a new object from the two operands, and then rebinds the name var1 to that new object. The old object is untouched, so any other name that referred to it still sees the old value. With var1 += var2, a mutable object such as a list is instead extended in place, so the ID stays the same and every name that refers to that object sees the change.
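
As a rough illustration of why this applies to mutable collections specifically, the sketch below compares a list with a tuple. The variable names are made up for the example. A list implements in-place addition (`__iadd__`), while a tuple does not, so for a tuple `+=` falls back to building a new object and rebinding the name.

```python
# Mutable: list defines __iadd__, so += modifies the existing object.
a = [7]
b = a
b += [3]
list_shared = b is a      # both names still refer to the same list

# Immutable: tuple has no __iadd__, so += builds a new tuple and rebinds u.
t = (7,)
u = t
u += (3,)
tuple_shared = u is t     # u now refers to a different object than t

print(a, list_shared)     # [7, 3] True
print(t, u, tuple_shared) # (7,) (7, 3) False
```

So whether `+=` mutates depends on the type on the left-hand side, not on the operator itself.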

Solution 2:

In the case of

W = np.array([0,1,2])
W1 = W
W1 += np.array([2,3,4])

W points to some location in memory that holds a numpy array, and W1 points to the same location. W1 += np.array([2,3,4]) takes that location in memory and changes its contents in place.
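
One way to check this, sketched here with the same arrays as above, is to compare object identity with `is` before and after the in-place addition. The helper names `same_before` and `same_after` are only for illustration.

```python
import numpy as np

W = np.array([0, 1, 2])
W1 = W                        # no copy: W1 is another name for the same array
same_before = W1 is W

W1 += np.array([2, 3, 4])     # in-place add on the shared array
same_after = W1 is W

print(same_before, same_after)  # True True
print(W)                        # [2 4 6]
```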

In this case:

W = np.array([0,1,2])
W1 = W
W1 = W1 + np.array([2,3,4])

W and W1 start out pointing to the same location in memory. You then create a new array (W1 + np.array([2,3,4])), which lives in a new location in memory. (Keep in mind: the right-hand side is always evaluated first, and only then is the result assigned to the variable on the left-hand side.) Then you make W1 point to this new location in memory (by assigning the new array to W1). W still points to the old location in memory. From this point on, W and W1 are no longer the same array.
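
The rebinding can be confirmed with `is` and `np.shares_memory`, as in the sketch below. The second half shows one possible way to keep using `+=` without touching the original, assuming an explicit copy is acceptable for your use case; the names `V` and `V1` are hypothetical.

```python
import numpy as np

W = np.array([0, 1, 2])
W1 = W
W1 = W1 + np.array([2, 3, 4])       # new array, then W1 is rebound to it
rebound = W1 is not W
overlap = np.shares_memory(W, W1)   # the two arrays use separate buffers

print(rebound, overlap)             # True False
print(W, W1)                        # [0 1 2] [2 4 6]

# If in-place += is wanted without changing the original, copy first.
V = np.array([0, 1, 2])
V1 = V.copy()                       # independent buffer
V1 += np.array([2, 3, 4])
print(V, V1)                        # [0 1 2] [2 4 6]
```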