Are numpy arrays passed by reference?
I have read in multiple places that numpy arrays are passed by reference. But when I execute the following code, why is there a difference between the behavior of foo and bar?
    import numpy as np

    def foo(arr):
        arr = arr - 3

    def bar(arr):
        arr -= 3

    a = np.array([3, 4, 5])
    foo(a)
    print a # prints [3, 4, 5]
    bar(a)
    print a # prints [0, 1, 2]
I'm using Python 2.7 and numpy version 1.6.1.
In Python, all variable names are references to values.
When Python evaluates an assignment, the right-hand side is evaluated before the left-hand side. The expression arr - 3 creates a new array; it does not modify arr in-place.
The statement arr = arr - 3 makes the local variable arr reference this new array. It does not modify the value originally referenced by arr, which was passed to foo. The variable name arr simply gets bound to the new array, arr - 3. Moreover, arr is a local variable name in the scope of the foo function. Once foo completes, the name arr no longer exists, and Python is free to garbage collect the value it referenced. As Reti43 points out, in order for arr's value to affect a, foo must return arr and a must be assigned to that value:
    def foo(arr):
        arr = arr - 3
        return arr
        # or simply combine both lines into `return arr - 3`

    a = foo(a)
In contrast, arr -= 3, which Python translates into a call to the __isub__ special method, does modify the array referenced by arr in-place.
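One way to see the difference is to check object identity after each call. The sketch below is only an illustration: the function names rebind and in_place are made up for this example, and both functions return arr purely so the caller can compare the result with the original array. It avoids the Python 2 print statement so that it also runs on Python 3.

```python
import numpy as np

def rebind(arr):
    # Binary minus builds a new array; only the local name arr is rebound.
    arr = arr - 3
    return arr

def in_place(arr):
    # Augmented assignment writes into the array the caller passed in.
    arr -= 3
    return arr

a = np.array([3, 4, 5])

rebound = rebind(a)
rebound_is_a = rebound is a        # False: a separate array object
a_after_rebind = a.tolist()        # a is unchanged: [3, 4, 5]

modified = in_place(a)
modified_is_a = modified is a      # True: the very same array object
a_after_in_place = a.tolist()      # a was changed: [0, 1, 2]

print("rebind returned the same object: %s" % rebound_is_a)
print("in_place returned the same object: %s" % modified_is_a)
```

In both cases the function receives a reference to the same array as a; what differs is whether the function body mutates that object or merely points its local name at a new one.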
The first function calculates (arr - 3), then binds the local name arr to the result, which doesn't affect the array data passed in. My guess is that in the second function, the numpy array type (np.ndarray) overrides the -= operator, and operates in place on the array data.
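That guess matches how Python dispatches the two operators: x - y calls __sub__, while x -= y calls __isub__ when the type defines it, and np.ndarray does. The mechanism can be sketched without numpy using a toy class. Box is a hypothetical container invented for this illustration and is not part of any library.

```python
class Box(object):
    """Hypothetical container showing which hook each operator calls."""

    def __init__(self, values):
        self.values = list(values)

    def __sub__(self, other):
        # box - n: build and return a brand-new Box, leaving self alone.
        return Box(v - other for v in self.values)

    def __isub__(self, other):
        # box -= n: mutate self, then return it so the name stays bound to it.
        for i in range(len(self.values)):
            self.values[i] -= other
        return self

def foo(box):
    box = box - 3   # calls __sub__; only the local name is rebound

def bar(box):
    box -= 3        # calls __isub__; the caller's object is modified

b = Box([3, 4, 5])
foo(b)
after_foo = list(b.values)   # [3, 4, 5]
bar(b)
after_bar = list(b.values)   # [0, 1, 2]
```

If a type defines no __isub__, Python falls back to __sub__ and rebinds the name, which is why -= on immutable values such as ints or tuples never affects the caller.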