Why does += behave unexpectedly on lists?
The += operator in Python seems to be operating unexpectedly on lists. Can anyone tell me what is going on here?
class foo:
    bar = []

    def __init__(self, x):
        self.bar += [x]


class foo2:
    bar = []

    def __init__(self, x):
        self.bar = self.bar + [x]


f = foo(1)
g = foo(2)
print(f.bar)
print(g.bar)

f.bar += [3]
print(f.bar)
print(g.bar)

f.bar = f.bar + [4]
print(f.bar)
print(g.bar)

f = foo2(1)
g = foo2(2)
print(f.bar)
print(g.bar)
OUTPUT
[1, 2]
[1, 2]
[1, 2, 3]
[1, 2, 3]
[1, 2, 3, 4]
[1, 2, 3]
[1]
[2]
foo += bar seems to affect every instance of the class, whereas foo = foo + bar seems to behave in the way I would expect things to behave.
The += operator is what Python calls an "augmented assignment" operator (other languages often call it a "compound assignment operator").
The general answer is that += tries to call the __iadd__ special method, and if that isn't available it tries to use __add__ instead. So the issue is with the difference between these special methods.
The __iadd__ special method is for an in-place addition; that is, it mutates the object that it acts on. The __add__ special method returns a new object and is also used for the standard + operator.
So when the += operator is used on an object which has an __iadd__ defined, the object is modified in place. Otherwise it will instead try to use the plain __add__ and return a new object.
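A minimal sketch of that fallback, using two hypothetical classes (AddOnly and InPlace are illustrative names, not from the question). The first defines only __add__, so += rebinds the name to a new object; the second also defines __iadd__, so += mutates the existing object.

```python
class AddOnly:
    """Hypothetical container that defines only __add__."""

    def __init__(self, items):
        self.items = list(items)

    def __add__(self, other):
        # Build and return a brand-new object; self is left untouched.
        return AddOnly(self.items + list(other))


class InPlace(AddOnly):
    """Same container, plus __iadd__ for in-place addition."""

    def __iadd__(self, other):
        # Mutate self and return it, as __iadd__ is expected to do.
        self.items.extend(other)
        return self


a = alias_a = AddOnly([1])
a += [2]                              # no __iadd__: runs a = a.__add__([2])
add_only_rebound = a is not alias_a   # a now names a different object

b = alias_b = InPlace([1])
b += [2]                              # __iadd__ exists: b is mutated in place
in_place_same = b is alias_b          # still the same object
```

After this runs, alias_a still holds the original one-element container, while alias_b sees the element added through b.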
That is why for mutable types like lists += changes the object's value, whereas for immutable types like tuples, strings and integers a new object is returned instead (a += b becomes equivalent to a = a + b).
For types that support both __iadd__ and __add__ you therefore have to be careful which one you use. a += b will call __iadd__ and mutate a, whereas a = a + b will create a new object and assign it to a. They are not the same operation!
>>> a1 = a2 = [1, 2]
>>> b1 = b2 = [1, 2]
>>> a1 += [3] # Uses __iadd__, modifies a1 in-place
>>> b1 = b1 + [3] # Uses __add__, creates new list, assigns it to b1
>>> a2
[1, 2, 3] # a1 and a2 are still the same list
>>> b2
[1, 2] # whereas only b1 was changed
For immutable types (where you don't have an __iadd__) a += b and a = a + b are equivalent. This is what lets you use += on immutable types, which might seem a strange design decision until you consider that otherwise you couldn't use += on immutable types like numbers!
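The immutable case can be checked the same way as the list example above. This is only a small sketch with made-up variable names: the alias keeps the old value because += had to build a new object and rebind the name.

```python
t1 = t2 = (1, 2)
t1 += (3,)          # tuples have no __iadd__, so this is t1 = t1 + (3,)

n1 = n2 = 10
n1 += 5             # same for ints: n1 is rebound, n2 is untouched

# Only the mutable type provides the in-place hook.
tuple_has_iadd = hasattr(tuple, "__iadd__")
list_has_iadd = hasattr(list, "__iadd__")
```

Here t2 and n2 keep their original values, which is the opposite of what happened to a2 in the list example.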
For the general case, see Scott Griffith's answer. When dealing with lists like you are, though, the += operator is a shorthand for someListObject.extend(iterableObject). See the documentation of extend().
The extend method appends all elements of its argument to the list.
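One visible consequence of the extend-like behaviour, sketched with an illustrative list: += accepts any iterable on the right-hand side, whereas plain + only concatenates a list with another list.

```python
nums = [1, 2]
nums += (3, 4)        # any iterable works, just like nums.extend((3, 4))
nums += "ab"          # a string is iterated character by character

try:
    nums + (5, 6)     # plain + insists on another list
    plus_failed = False
except TypeError:
    plus_failed = True
```

At the end nums contains the two original numbers, the two tuple elements, and the characters 'a' and 'b', and the attempt to use + with a tuple has raised TypeError.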
When doing foo += something you're modifying the list foo in place, thus you don't change the reference that the name foo points to, but you're changing the list object directly. With foo = foo + something, you're actually creating a new list.
This example code will explain it:
>>> l = []
>>> id(l)
13043192
>>> l += [3]
>>> id(l)
13043192
>>> l = l + [3]
>>> id(l)
13059216
Note how the reference changes when you reassign the new list to l.
As bar is a class variable instead of an instance variable, modifying it in place affects all instances of that class. But when self.bar is rebound, the instance gets its own separate instance variable self.bar, and the other instances are unaffected.
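The difference can be made visible with vars(), which shows what is stored on an instance itself. A small sketch with a hypothetical Basket class (not from the question):

```python
class Basket:            # hypothetical class, not from the question
    items = []           # one list, stored on the class


a = Basket()
b = Basket()

a.items.append("apple")          # in-place change to the shared list
seen_by_b = list(b.items)        # b sees the same list contents

a.items = a.items + ["pear"]     # rebinding creates an attribute on a only
a_has_own = "items" in vars(a)   # a now carries its own 'items'
b_has_own = "items" in vars(b)   # b still falls back to the class list
```

After the rebinding, a.items is a new two-element list, while b.items and Basket.items are still the shared one-element list.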
The problem here is that bar is defined as a class attribute, not an instance variable.
In foo, the class attribute is modified in the __init__ method; that's why all instances are affected.
In foo2, an instance variable is defined using the (empty) class attribute, and every instance gets its own bar.
The "correct" implementation would be:
class foo:
    def __init__(self, x):
        self.bar = [x]
Of course, class attributes are completely legal. In fact, you can access and modify them without creating an instance of the class like this:
class foo:
    bar = []

x = 1          # any value
foo.bar = [x]  # rebinds the class attribute; no instance is needed
There are two things involved here:
1. class attributes and instance attributes
2. difference between the operators + and += for lists
The + operator calls the __add__ method on a list. It takes all the elements from its operands and makes a new list containing those elements, maintaining their order.
The += operator calls the __iadd__ method on the list. It takes an iterable and appends all the elements of the iterable to the list in place. It does not create a new list object.
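In class foo the statement self.bar += [x] is an augmented assignment that roughly translates to
self.bar = self.bar.__iadd__([x]) # mutates the class attribute's list, then binds self.bar to that same list
The __iadd__ call modifies the list in place and acts like the list method extend. The assignment does create an instance attribute bar, but it refers to the same list object as the class attribute, so every instance still shares it.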
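This can be checked directly. The sketch below mirrors the question's foo under the illustrative name Foo: the class-level list grows, every instance refers to that same object, and the augmented assignment also stores the result back on the instance.

```python
class Foo:                      # mirrors the question's foo; name is illustrative
    bar = []

    def __init__(self, x):
        self.bar += [x]         # in-place extend, then the result is stored on self


f = Foo(1)
g = Foo(2)

class_list = Foo.bar                      # the class-level list has grown
same_object = f.bar is g.bar is Foo.bar   # everyone shares one list object
bound_on_instance = "bar" in vars(f)      # the result was also bound on f
```

Because the name bound on each instance points at the shared list, the instances do not become independent of each other.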
In class foo2, on the contrary, the assignment statement in the __init__ method
self.bar = self.bar + [x]
can be deconstructed as:
The instance has no attribute bar (there is a class attribute of the same name, though) so it accesses the class attribute bar and creates a new list by appending x to it. The statement translates to:
self.bar = self.bar.__add__([x]) # bar on the rhs is the class attribute
Then it creates an instance attribute bar and assigns the newly created list to it. Note that bar on the rhs of the assignment is different from the bar on the lhs.
For instances of class foo, bar always refers to the single list held by the class, not to a per-instance list. Hence any in-place change to that list is reflected in all instances.
On the contrary, each instance of the class foo2 has its own instance attribute bar, which is different from the class attribute of the same name bar.
f = foo2(4)
print(f.bar)            # accessing the instance attribute; prints [4]
print(f.__class__.bar)  # accessing the class attribute; prints []
Hope this clears things up.