Finding the indices of matching elements in list in Python
I have a long list of float numbers ranging from 1 to 5, called "average", and I want to return the list of indices for elements that are smaller than a or larger than b
def find(lst,a,b):
result = []
for x in lst:
if x<a or x>b:
i = lst.index(x)
result.append(i)
return result
matches = find(average,2,4)
But surprisingly, the output for "matches" has a lot of repetitions in it, e.g. [2, 2, 10, 2, 2, 2, 19, 2, 10, 2, 2, 42, 2, 2, 10, 2, 2, 2, 10, 2, 2, ...]
.
Why is this happening?
Solution 1:
You are using .index()
which will only find the first occurrence of your value in the list. So if you have a value 1.0 at index 2, and at index 9, then .index(1.0)
will always return 2
, no matter how many times 1.0
occurs in the list.
Use enumerate()
to add indices to your loop instead:
def find(lst, a, b):
result = []
for i, x in enumerate(lst):
if x<a or x>b:
result.append(i)
return result
You can collapse this into a list comprehension:
def find(lst, a, b):
return [i for i, x in enumerate(lst) if x<a or x>b]
Solution 2:
if you're doing a lot of this kind of thing you should consider using numpy
.
In [56]: import random, numpy
In [57]: lst = numpy.array([random.uniform(0, 5) for _ in range(1000)]) # example list
In [58]: a, b = 1, 3
In [59]: numpy.flatnonzero((lst > a) & (lst < b))[:10]
Out[59]: array([ 0, 12, 13, 15, 18, 19, 23, 24, 26, 29])
In response to Seanny123's question, I used this timing code:
import numpy, timeit, random
a, b = 1, 3
lst = numpy.array([random.uniform(0, 5) for _ in range(1000)])
def numpy_way():
numpy.flatnonzero((lst > 1) & (lst < 3))[:10]
def list_comprehension():
[e for e in lst if 1 < e < 3][:10]
print timeit.timeit(numpy_way)
print timeit.timeit(list_comprehension)
The numpy version is over 60 times faster.