Move all zeroes to the beginning of a list in Python

I have a list like below:

a = [4, 5, 0, 0, 6, 7, 0, 1, 0, 5]

and I want to push all zeroes to the beginning of that list. The result must be like below.

a = [0, 0, 0, 0, 4, 5, 6, 7, 1, 5]

How to do it in Python 2?


You could sort the list:

a.sort(key=lambda v: v != 0)

The key function tells Python to sort values by wether or not they are 0. False is sorted before True, and values are then sorted based on their original relative position.

For 0, False is returned, sorting all those values first. For the rest True is returned, leaving sort to put them last but leave their relative positions untouched.

Demo:

>>> a = [4, 5, 0, 0, 6, 7, 0, 1, 0, 5]
>>> a.sort(key=lambda v: v != 0)
>>> a
[0, 0, 0, 0, 4, 5, 6, 7, 1, 5]

This can be done without sorting.

Solutions

Initialization:

In [8]: a = [4, 5, 0, 0, 6, 7, 0, 1, 0, 5]

In [9]: from itertools import compress, repeat, chain

list.count and itertools.compress

In [10]: x = [0] * a.count(0); x.extend(compress(a, a))

In [11]: x
Out[11]: [0, 0, 0, 0, 4, 5, 6, 7, 1, 5]

Same as before, but without list.count

In [12]: c = list(compress(a, a)); [0] * (len(a) - len(c)) + c
Out[12]: [0, 0, 0, 0, 4, 5, 6, 7, 1, 5]

list.count, itertools.compress, itertools.repeat, itertools.chain

In [13]: list(chain(repeat(0, a.count(0)), compress(a, a)))
Out[13]: [0, 0, 0, 0, 4, 5, 6, 7, 1, 5]

Same as the previous one, but without list.count

In [14]: c = list(compress(a, a)); list(chain(repeat(0, len(a) - len(c)), c))
Out[14]: [0, 0, 0, 0, 4, 5, 6, 7, 1, 5]

Benchmarks

For small lists:

In [21]: %timeit x = [0] * a.count(0); x.extend(compress(a, a))
1000000 loops, best of 3: 583 ns per loop

In [22]: %timeit c = list(compress(a, a)); [0] * (len(a) - len(c)) + c
1000000 loops, best of 3: 661 ns per loop

In [23]: %timeit list(chain(repeat(0, a.count(0)), compress(a, a)))
1000000 loops, best of 3: 762 ns per loop

In [24]: %timeit c = list(compress(a, a)); list(chain(repeat(0, len(a) - len(c)), c))
1000000 loops, best of 3: 900 ns per loop

For large lists:

In [28]: a *= 10000000

In [29]: %timeit x = [0] * a.count(0); x.extend(compress(a, a))
1 loops, best of 3: 1.43 s per loop

In [30]: %timeit c = list(compress(a, a)); [0] * (len(a) - len(c)) + c
1 loops, best of 3: 1.37 s per loop

In [31]: %timeit list(chain(repeat(0, a.count(0)), compress(a, a)))
1 loops, best of 3: 1.79 s per loop

In [32]: %timeit c = list(compress(a, a)); list(chain(repeat(0, len(a) - len(c)), c))
1 loops, best of 3: 1.47 s per loop

As you can see, in some cases itertools-based solutions tend to be slower, because of the big number of function calls.


Here are some better timings, with two new methods:

SETUP="
from itertools import compress, repeat, chain
a = [4, 5, 0, 0, 6, 7, 0, 1, 0, 5]
"

First the sorting:

python -m timeit -s "$SETUP" "a.sort(key=bool)"
# 1000000 loops, best of 3: 1.51 usec per loop

Then frostnational's methods:

python -m timeit -s "$SETUP" "list(chain(repeat(0, a.count(0)), compress(a, a)))"
# 1000000 loops, best of 3: 1.16 usec per loop

python -m timeit -s "$SETUP" "cs = list(compress(a, a)); list(chain(repeat(0, len(a)-len(cs)), cs))"
# 1000000 loops, best of 3: 1.37 usec per loop

Then methods more directly working from lists:

python -m timeit -s "$SETUP" "[0] * a.count(0) + list(filter(bool, a))"
# 1000000 loops, best of 3: 1.04 usec per loop

python -m timeit -s "$SETUP" "nonzero = list(filter(bool, a)); [0] * (len(a)-len(nonzero)) + nonzero"
# 1000000 loops, best of 3: 0.87 usec per loop

And again with larger sized input:

SETUP="
from itertools import compress, repeat, chain
a = [4, 5, 0, 0, 6, 7, 0, 1, 0, 5] * 1000
"

Sorting:

python -m timeit -s "$SETUP" "a.sort(key=bool)"
# 1000 loops, best of 3: 1.08 msec per loop

frostnational's:

python -m timeit -s "$SETUP" "list(chain(repeat(0, a.count(0)), compress(a, a)))"
# 1000 loops, best of 3: 333 usec per loop

python -m timeit -s "$SETUP" "cs = list(compress(a, a)); list(chain(repeat(0, len(a)-len(cs)), cs))"
# 1000 loops, best of 3: 206 usec per loop

New:

python -m timeit -s "$SETUP" "[0] * a.count(0) + list(filter(bool, a))"
# 1000 loops, best of 3: 295 usec per loop

python -m timeit -s "$SETUP" "nonzero = list(filter(bool, a)); [0] * (len(a)-len(nonzero)) + nonzero"
# 10000 loops, best of 3: 143 usec per loop

That said, despite being relatively slow, Martijn Pieters' decision to use sorting is actually pretty competitive for reasonably-sized lists and premature optimisation is the root of all evil.


FWIW, here are some timings for very long lists:

SETUP="
from itertools import compress, repeat, chain
a = [4, 5, 0, 0, 6, 7, 0, 1, 0, 5] * 1000000
"
python -m timeit -s "$SETUP" "a.sort(key=bool)"
# 10 loops, best of 3: 1.21 sec per loop

python -m timeit -s "$SETUP" "list(chain(repeat(0, a.count(0)), compress(a, a)))"
# 10 loops, best of 3: 347 msec per loop

python -m timeit -s "$SETUP" "cs = list(compress(a, a)); list(chain(repeat(0, len(a)-len(cs)), cs))"
# 10 loops, best of 3: 226 msec per loop

python -m timeit -s "$SETUP" "[0] * a.count(0) + list(filter(bool, a))"
# 10 loops, best of 3: 310 msec per loop

python -m timeit -s "$SETUP" "nonzero = list(filter(bool, a)); [0] * (len(a)-len(nonzero)) + nonzero"
# 10 loops, best of 3: 153 msec per loop