Understanding dict.copy() - shallow or deep?
While reading up the documentation for dict.copy()
, it says that it makes a shallow copy of the dictionary. Same goes for the book I am following (Beazley's Python Reference), which says:
The m.copy() method makes a shallow copy of the items contained in a mapping object and places them in a new mapping object.
Consider this:
>>> original = dict(a=1, b=2)
>>> new = original.copy()
>>> new.update({'c': 3})
>>> original
{'a': 1, 'b': 2}
>>> new
{'a': 1, 'c': 3, 'b': 2}
So I assumed this would update the value of original
(and add 'c': 3) also since I was doing a shallow copy. Like if you do it for a list:
>>> original = [1, 2, 3]
>>> new = original
>>> new.append(4)
>>> new, original
([1, 2, 3, 4], [1, 2, 3, 4])
This works as expected.
Since both are shallow copies, why is that the dict.copy()
doesn't work as I expect it to? Or my understanding of shallow vs deep copying is flawed?
By "shallow copying" it means the content of the dictionary is not copied by value, but just creating a new reference.
>>> a = {1: [1,2,3]}
>>> b = a.copy()
>>> a, b
({1: [1, 2, 3]}, {1: [1, 2, 3]})
>>> a[1].append(4)
>>> a, b
({1: [1, 2, 3, 4]}, {1: [1, 2, 3, 4]})
In contrast, a deep copy will copy all contents by value.
>>> import copy
>>> c = copy.deepcopy(a)
>>> a, c
({1: [1, 2, 3, 4]}, {1: [1, 2, 3, 4]})
>>> a[1].append(5)
>>> a, c
({1: [1, 2, 3, 4, 5]}, {1: [1, 2, 3, 4]})
So:
-
b = a
: Reference assignment, Makea
andb
points to the same object. -
b = a.copy()
: Shallow copying,a
andb
will become two isolated objects, but their contents still share the same reference -
b = copy.deepcopy(a)
: Deep copying,a
andb
's structure and content become completely isolated.
Take this example:
original = dict(a=1, b=2, c=dict(d=4, e=5))
new = original.copy()
Now let's change a value in the 'shallow' (first) level:
new['a'] = 10
# new = {'a': 10, 'b': 2, 'c': {'d': 4, 'e': 5}}
# original = {'a': 1, 'b': 2, 'c': {'d': 4, 'e': 5}}
# no change in original, since ['a'] is an immutable integer
Now let's change a value one level deeper:
new['c']['d'] = 40
# new = {'a': 10, 'b': 2, 'c': {'d': 40, 'e': 5}}
# original = {'a': 1, 'b': 2, 'c': {'d': 40, 'e': 5}}
# new['c'] points to the same original['d'] mutable dictionary, so it will be changed
It's not a matter of deep copy or shallow copy, none of what you're doing is deep copy.
Here:
>>> new = original
you're creating a new reference to the the list/dict referenced by original.
while here:
>>> new = original.copy()
>>> # or
>>> new = list(original) # dict(original)
you're creating a new list/dict which is filled with a copy of the references of objects contained in the original container.