Why is [] faster than list()?

Solution 1:

Because [] and {} are literal syntax. Python can create bytecode just to create the list or dictionary objects:

>>> import dis
>>> dis.dis(compile('[]', '', 'eval'))
  1           0 BUILD_LIST               0
              3 RETURN_VALUE        
>>> dis.dis(compile('{}', '', 'eval'))
  1           0 BUILD_MAP                0
              3 RETURN_VALUE

list() and dict() are separate objects. Their names need to be resolved, the stack has to be involved to push the arguments, the frame has to be stored to retrieve later, and a call has to be made. That all takes more time.

For the empty case, that means you have at the very least a LOAD_NAME (which has to search through the global namespace as well as the builtins module) followed by a CALL_FUNCTION, which has to preserve the current frame:

>>> dis.dis(compile('list()', '', 'eval'))
  1           0 LOAD_NAME                0 (list)
              3 CALL_FUNCTION            0
              6 RETURN_VALUE        
>>> dis.dis(compile('dict()', '', 'eval'))
  1           0 LOAD_NAME                0 (dict)
              3 CALL_FUNCTION            0
              6 RETURN_VALUE

You can time the name lookup separately with timeit:

>>> import timeit
>>> timeit.timeit('list', number=10**7)
0.30749011039733887
>>> timeit.timeit('dict', number=10**7)
0.4215109348297119

The time discrepancy there is probably a dictionary hash collision. Subtract those times from the times for calling those objects, and compare the result against the times for using literals:

>>> timeit.timeit('[]', number=10**7)
0.30478692054748535
>>> timeit.timeit('{}', number=10**7)
0.31482696533203125
>>> timeit.timeit('list()', number=10**7)
0.9991960525512695
>>> timeit.timeit('dict()', number=10**7)
1.0200958251953125

So having to call the object takes an additional 1.00 - 0.31 - 0.30 == 0.39 seconds per 10 million calls.

You can avoid the global lookup cost by aliasing the global names as locals (using a timeit setup, everything you bind to a name is a local):

>>> timeit.timeit('_list', '_list = list', number=10**7)
0.1866450309753418
>>> timeit.timeit('_dict', '_dict = dict', number=10**7)
0.19016098976135254
>>> timeit.timeit('_list()', '_list = list', number=10**7)
0.841480016708374
>>> timeit.timeit('_dict()', '_dict = dict', number=10**7)
0.7233691215515137

but you never can overcome that CALL_FUNCTION cost.

Solution 2:

list() requires a global lookup and a function call but [] compiles to a single instruction. See:

Python 2.7.3
>>> import dis
>>> dis.dis(lambda: list())
  1           0 LOAD_GLOBAL              0 (list)
              3 CALL_FUNCTION            0
              6 RETURN_VALUE        
>>> dis.dis(lambda: [])
  1           0 BUILD_LIST               0
              3 RETURN_VALUE

Solution 3:

Because list is a function to convert say a string to a list object, while [] is used to create a list off the bat. Try this (might make more sense to you):

x = "wham bam"
a = list(x)
>>> a
["w", "h", "a", "m", ...]

While

y = ["wham bam"]
>>> y
["wham bam"]

Gives you a actual list containing whatever you put in it.

Solution 4:

The answers here are great, to the point and fully cover this question. I'll drop a further step down from byte-code for those interested. I'm using the most recent repo of CPython; older versions behave similar in this regard but slight changes might be in place.

Here's a break down of the execution for each of these, BUILD_LIST for [] and CALL_FUNCTION for list().

The `BUILD_LIST` instruction:

You should just view the horror:

PyObject *list =  PyList_New(oparg);
if (list == NULL)
    goto error;
while (--oparg >= 0) {
    PyObject *item = POP();
    PyList_SET_ITEM(list, oparg, item);
}
PUSH(list);
DISPATCH();

Terribly convoluted, I know. This is how simple it is:

Create a new list with PyList_New (this mainly allocates the memory for a new list object), oparg signalling the number of arguments on the stack. Straight to the point.
Check that nothing went wrong with if (list==NULL).
Add any arguments (in our case this isn't executed) located on the stack with PyList_SET_ITEM (a macro).

No wonder it is fast! It's custom-made for creating new lists, nothing else :-)

The `CALL_FUNCTION` instruction:

Here's the first thing you see when you peek at the code handling CALL_FUNCTION:

PyObject **sp, *res;
sp = stack_pointer;
res = call_function(&sp, oparg, NULL);
stack_pointer = sp;
PUSH(res);
if (res == NULL) {
    goto error;
}
DISPATCH();

Looks pretty harmless, right? Well, no, unfortunately not, call_function is not a straightforward guy that will call the function immediately, it can't. Instead, it grabs the object from the stack, grabs all arguments of the stack and then switches based on the type of the object; is it a:

PyCFunction_Type? Nope, it is list, list isn't of type PyCFunction
PyMethodType? Nope, see previous.
PyFunctionType? Nopee, see previous.

We're calling the list type, the argument passed in to call_function is PyList_Type. CPython now has to call a generic function to handle any callable objects named _PyObject_FastCallKeywords, yay more function calls.

This function again makes some checks for certain function types (which I cannot understand why) and then, after creating a dict for kwargs if required, goes on to call _PyObject_FastCallDict.

_PyObject_FastCallDict finally gets us somewhere! After performing even more checks it grabs the tp_call slot from the type of the type we've passed in, that is, it grabs type.tp_call. It then proceeds to create a tuple out of of the arguments passed in with _PyStack_AsTuple and, finally, a call can finally be made!

tp_call, which matches type.__call__ takes over and finally creates the list object. It calls the lists __new__ which corresponds to PyType_GenericNew and allocates memory for it with PyType_GenericAlloc: This is actually the part where it catches up with PyList_New, finally. All the previous are necessary to handle objects in a generic fashion.

In the end, type_call calls list.__init__ and initializes the list with any available arguments, then we go on a returning back the way we came. :-)

Finally, remmeber the LOAD_NAME, that's another guy that contributes here.

It's easy to see that, when dealing with our input, Python generally has to jump through hoops in order to actually find out the appropriate C function to do the job. It doesn't have the curtesy of immediately calling it because it's dynamic, someone might mask list (and boy do many people do) and another path must be taken.

This is where list() loses much: The exploring Python needs to do to find out what the heck it should do.

Literal syntax, on the other hand, means exactly one thing; it cannot be changed and always behaves in a pre-determined way.

^{Footnote: All function names are subject to change from one release to the other. The point still stands and most likely will stand in any future versions, it's the dynamic look-up that slows things down.}

Why is [] faster than list()?

Solution 1:

Solution 2:

Solution 3:

Solution 4:

The `BUILD_LIST` instruction:

The `CALL_FUNCTION` instruction:

Related

Recent Posts

Why is [] faster than list()?

Solution 1:

Solution 2:

Solution 3:

Solution 4:

The BUILD_LIST instruction:

The CALL_FUNCTION instruction:

Related

Recent Posts

The `BUILD_LIST` instruction:

The `CALL_FUNCTION` instruction: