Initializing a list to a known number of elements in Python [duplicate]

Right now I am using a list, and was expecting something like:

verts = list (1000)

Should I use array instead?


Solution 1:

The first thing that comes to mind for me is:

verts = [None]*1000

But do you really need to preinitialize it?

Solution 2:

Not quite sure why everyone is giving you a hard time for wanting to do this - there are several scenarios where you'd want a fixed size initialised list. And you've correctly deduced that arrays are sensible in these cases.

import array
verts=array.array('i',(0,)*1000)

For the non-pythonistas, the (0,)*1000 term is creating a tuple containing 1000 zeros. The comma forces python to recognise (0) as a tuple, otherwise it would be evaluated as 0.

I've used a tuple instead of a list because they are generally have lower overhead.

Solution 3:

One obvious and probably not efficient way is

verts = [0 for x in range(1000)]

Note that this can be extended to 2-dimension easily. For example, to get a 10x100 "array" you can do

verts = [[0 for x in range(100)] for y in range(10)]

Solution 4:

Wanting to initalize an array of fixed size is a perfectly acceptable thing to do in any programming language; it isn't like the programmer wants to put a break statement in a while(true) loop. Believe me, especially if the elements are just going to be overwritten and not merely added/subtracted, like is the case of many dynamic programming algorithms, you don't want to mess around with append statements and checking if the element hasn't been initialized yet on the fly (that's a lot of code gents).

object = [0 for x in range(1000)]

This will work for what the programmer is trying to achieve.

Solution 5:

@Steve already gave a good answer to your question:

verts = [None] * 1000

Warning: As @Joachim Wuttke pointed out, the list must be initialized with an immutable element. [[]] * 1000 does not work as expected because you will get a list of 1000 identical lists (similar to a list of 1000 points to the same list in C). Immutable objects like int, str or tuple will do fine.

Alternatives

Resizing lists is slow. The following results are not very surprising:

>>> N = 10**6

>>> %timeit a = [None] * N
100 loops, best of 3: 7.41 ms per loop

>>> %timeit a = [None for x in xrange(N)]
10 loops, best of 3: 30 ms per loop

>>> %timeit a = [None for x in range(N)]
10 loops, best of 3: 67.7 ms per loop

>>> a = []
>>> %timeit for x in xrange(N): a.append(None)
10 loops, best of 3: 85.6 ms per loop

But resizing is not very slow if you don't have very large lists. Instead of initializing the list with a single element (e.g. None) and a fixed length to avoid list resizing, you should consider using list comprehensions and directly fill the list with correct values. For example:

>>> %timeit a = [x**2 for x in xrange(N)]
10 loops, best of 3: 109 ms per loop

>>> def fill_list1():
    """Not too bad, but complicated code"""
    a = [None] * N
    for x in xrange(N):
        a[x] = x**2
>>> %timeit fill_list1()
10 loops, best of 3: 126 ms per loop

>>> def fill_list2():
    """This is slow, use only for small lists"""
    a = []
    for x in xrange(N):
        a.append(x**2)
>>> %timeit fill_list2()
10 loops, best of 3: 177 ms per loop

Comparison to numpy

For huge data set numpy or other optimized libraries are much faster:

from numpy import ndarray, zeros
%timeit empty((N,))
1000000 loops, best of 3: 788 ns per loop

%timeit zeros((N,))
100 loops, best of 3: 3.56 ms per loop