What causes [*a] to overallocate?
Apparently list(a)
doesn't overallocate, [x for x in a]
overallocates at some points, and [*a]
overallocates all the time?
Here are sizes n from 0 to 12 and the resulting sizes in bytes for the three methods:
0 56 56 56
1 64 88 88
2 72 88 96
3 80 88 104
4 88 88 112
5 96 120 120
6 104 120 128
7 112 120 136
8 120 120 152
9 128 184 184
10 136 184 192
11 144 184 200
12 152 184 208
Computed like this, reproducable at repl.it, using Python 3.8:
from sys import getsizeof
for n in range(13):
a = [None] * n
print(n, getsizeof(list(a)),
getsizeof([x for x in a]),
getsizeof([*a]))
So: How does this work? How does [*a]
overallocate? Actually, what mechanism does it use to create the result list from the given input? Does it use an iterator over a
and use something like list.append
? Where is the source code?
(Colab with data and code that produced the images.)
Zooming in to smaller n:
Zooming out to larger n:
[*a]
is internally doing the C equivalent of:
- Make a new, empty
list
- Call
newlist.extend(a)
- Returns
list
.
So if you expand your test to:
from sys import getsizeof
for n in range(13):
a = [None] * n
l = []
l.extend(a)
print(n, getsizeof(list(a)),
getsizeof([x for x in a]),
getsizeof([*a]),
getsizeof(l))
Try it online!
you'll see the results for getsizeof([*a])
and l = []; l.extend(a); getsizeof(l)
are the same.
This is usually the right thing to do; when extend
ing you're usually expecting to add more later, and similarly for generalized unpacking, it's assumed that multiple things will be added one after the other. [*a]
is not the normal case; Python assumes there are multiple items or iterables being added to the list
([*a, b, c, *d]
), so overallocation saves work in the common case.
By contrast, a list
constructed from a single, presized iterable (with list()
) may not grow or shrink during use, and overallocating is premature until proven otherwise; Python recently fixed a bug that made the constructor overallocate even for inputs with known size.
As for list
comprehensions, they're effectively equivalent to repeated append
s, so you're seeing the final result of the normal overallocation growth pattern when adding an element at a time.
To be clear, none of this is a language guarantee. It's just how CPython implements it. The Python language spec is generally unconcerned with specific growth patterns in list
(aside from guaranteeing amortized O(1)
append
s and pop
s from the end). As noted in the comments, the specific implementation changes again in 3.9; while it won't affect [*a]
, it could affect other cases where what used to be "build a temporary tuple
of individual items and then extend
with the tuple
" now becomes multiple applications of LIST_APPEND
, which can change when the overallocation occurs and what numbers go into the calculation.
Full picture of what happens, building on the other answers and comments (especially ShadowRanger's answer, which also explains why it's done like that).
Disassembling shows that BUILD_LIST_UNPACK
gets used:
>>> import dis
>>> dis.dis('[*a]')
1 0 LOAD_NAME 0 (a)
2 BUILD_LIST_UNPACK 1
4 RETURN_VALUE
That's handled in ceval.c
, which builds an empty list and extends it (with a
):
case TARGET(BUILD_LIST_UNPACK): {
...
PyObject *sum = PyList_New(0);
...
none_val = _PyList_Extend((PyListObject *)sum, PEEK(i));
_PyList_Extend
uses list_extend
:
_PyList_Extend(PyListObject *self, PyObject *iterable)
{
return list_extend(self, iterable);
}
Which calls list_resize
with the sum of the sizes:
list_extend(PyListObject *self, PyObject *iterable)
...
n = PySequence_Fast_GET_SIZE(iterable);
...
m = Py_SIZE(self);
...
if (list_resize(self, m + n) < 0) {
And that overallocates as follows:
list_resize(PyListObject *self, Py_ssize_t newsize)
{
...
new_allocated = (size_t)newsize + (newsize >> 3) + (newsize < 9 ? 3 : 6);
Let's check that. Compute the expected number of spots with the formula above, and compute the expected byte size by multiplying it with 8 (as I'm using 64-bit Python here) and adding an empty list's byte size (i.e., a list object's constant overhead):
from sys import getsizeof
for n in range(13):
a = [None] * n
expected_spots = n + (n >> 3) + (3 if n < 9 else 6)
expected_bytesize = getsizeof([]) + expected_spots * 8
real_bytesize = getsizeof([*a])
print(n,
expected_bytesize,
real_bytesize,
real_bytesize == expected_bytesize)
Output:
0 80 56 False
1 88 88 True
2 96 96 True
3 104 104 True
4 112 112 True
5 120 120 True
6 128 128 True
7 136 136 True
8 152 152 True
9 184 184 True
10 192 192 True
11 200 200 True
12 208 208 True
Matches except for n = 0
, which list_extend
actually shortcuts, so actually that matches, too:
if (n == 0) {
...
Py_RETURN_NONE;
}
...
if (list_resize(self, m + n) < 0) {