Extract first item of each sublist
What is the best way to extract the first item of each sublist in a list of lists and append it to a new list? So if I have:
lst = [['a','b','c'], [1,2,3], ['x','y','z']]
and I want to pull out a, 1 and x and create a separate list from those.
I tried:
lst2.append(x[0] for x in lst)
Using list comprehension:
>>> lst = [['a','b','c'], [1,2,3], ['x','y','z']]
>>> lst2 = [item[0] for item in lst]
>>> lst2
['a', 1, 'x']
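As for why the attempt in the question does not give this result: append adds its single argument as one element, so passing it a generator expression stores the generator object itself rather than the first items. A minimal sketch of the difference, assuming an empty starting list (the name attempt is only illustrative):

```python
lst = [['a', 'b', 'c'], [1, 2, 3], ['x', 'y', 'z']]

# append() adds its single argument as ONE element, so the generator
# expression itself ends up inside the list, unevaluated.
attempt = []
attempt.append(x[0] for x in lst)
print(len(attempt))  # 1

# extend() consumes the iterable and adds each first item separately.
lst2 = []
lst2.extend(x[0] for x in lst)
print(lst2)  # ['a', 1, 'x']
```

If lst2 already exists and you want to add to it, extend is probably the closest fix to the original attempt; otherwise the list comprehension above is simpler.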
In Python 2, where zip returns a list, you could use zip:
>>> lst=[[1,2,3],[11,12,13],[21,22,23]]
>>> zip(*lst)[0]
(1, 11, 21)
Or, in Python 3, where zip does not produce a list:
>>> list(zip(*lst))[0]
(1, 11, 21)
Or, to take only the first tuple without building the whole list:
>>> next(zip(*lst))
(1, 11, 21)
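One thing to keep in mind with the zip variants: zip stops at the shortest input, so a single empty sublist makes the whole result empty. The sketch below shows that behaviour on a made-up ragged list (ragged is a hypothetical example, not data from the question), along with a guarded comprehension that simply skips empty sublists:

```python
ragged = [[1, 2, 3], [], [21, 22]]

# zip(*ragged) yields nothing because one sublist is empty; giving next()
# a default avoids the StopIteration it would otherwise raise.
first_column = next(zip(*ragged), ())
print(first_column)  # ()

# A comprehension can skip empty sublists instead of failing on sub[0].
firsts = [sub[0] for sub in ragged if sub]
print(firsts)  # [1, 21]
```

Whether skipping empty sublists is the right behaviour depends on your data; this is just one way to handle it.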
Or (my favorite), use numpy:
>>> import numpy as np
>>> a=np.array([[1,2,3],[11,12,13],[21,22,23]])
>>> a
array([[ 1,  2,  3],
       [11, 12, 13],
       [21, 22, 23]])
>>> a[:,0]
array([ 1, 11, 21])
I had the same issue and got curious about the performance of each solution.
Here are the %timeit results:
import numpy as np
lst = [['a','b','c'], [1,2,3], ['x','y','z']]
The first numpy way, transposing the array:
%timeit list(np.array(lst).T[0])
4.9 µs ± 163 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
Fully native using list comprehension (as explained by @alecxe):
%timeit [item[0] for item in lst]
379 ns ± 23.1 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
Another native way using zip (as explained by @dawg):
%timeit list(zip(*lst))[0]
585 ns ± 7.26 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
The second numpy way, also explained by @dawg:
%timeit list(np.array(lst)[:,0])
4.95 µs ± 179 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
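Surprisingly (well, at least for me), the native way using a list comprehension is the fastest, more than 10x faster than either numpy way on this small input. Both numpy timings include converting the list to an array with np.array(lst). Running the two numpy variants without the final list call saves about one µs, which still leaves about a 10x difference.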
Note that when I wrapped each code snippet in a call to len, to ensure that any generators run to the end, the timings stayed the same.
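If you want to repeat the comparison outside IPython, the standard library's timeit module can stand in for %timeit. This is only a rough sketch: the wrapper function names are made up for illustration, the loop count is an arbitrary choice, and wrapping each snippet in a function adds a little call overhead, so the absolute numbers will differ from the ones above.

```python
import timeit

lst = [['a', 'b', 'c'], [1, 2, 3], ['x', 'y', 'z']]

def with_comprehension():
    # First item of each sublist via list comprehension.
    return [item[0] for item in lst]

def with_zip():
    # First "column" after transposing with zip.
    return list(zip(*lst))[0]

number = 100_000  # arbitrary loop count, chosen to keep the run short
results = {}
for func in (with_comprehension, with_zip):
    total = timeit.timeit(func, number=number)
    results[func.__name__] = total / number
    print(f"{func.__name__}: {results[func.__name__] * 1e9:.0f} ns per call")
```

Timings depend heavily on the machine, the Python version and the size of the list, so treat the ranking on a 3x3 input as a hint rather than a general result.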