Finding the indices of matching elements in list in Python

I have a long list of float numbers ranging from 1 to 5, called "average", and I want to return the list of indices for elements that are smaller than a or larger than b

def find(lst,a,b):
    result = []
    for x in lst:
        if x<a or x>b:
            i = lst.index(x)
            result.append(i)
    return result

matches = find(average,2,4)

But surprisingly, the output for "matches" has a lot of repetitions in it, e.g. [2, 2, 10, 2, 2, 2, 19, 2, 10, 2, 2, 42, 2, 2, 10, 2, 2, 2, 10, 2, 2, ...].

Why is this happening?


Solution 1:

You are using .index() which will only find the first occurrence of your value in the list. So if you have a value 1.0 at index 2, and at index 9, then .index(1.0) will always return 2, no matter how many times 1.0 occurs in the list.

Use enumerate() to add indices to your loop instead:

def find(lst, a, b):
    result = []
    for i, x in enumerate(lst):
        if x<a or x>b:
            result.append(i)
    return result

You can collapse this into a list comprehension:

def find(lst, a, b):
    return [i for i, x in enumerate(lst) if x<a or x>b]

Solution 2:

if you're doing a lot of this kind of thing you should consider using numpy.

In [56]: import random, numpy

In [57]: lst = numpy.array([random.uniform(0, 5) for _ in range(1000)]) # example list

In [58]: a, b = 1, 3

In [59]: numpy.flatnonzero((lst > a) & (lst < b))[:10]
Out[59]: array([ 0, 12, 13, 15, 18, 19, 23, 24, 26, 29])

In response to Seanny123's question, I used this timing code:

import numpy, timeit, random

a, b = 1, 3

lst = numpy.array([random.uniform(0, 5) for _ in range(1000)])

def numpy_way():
    numpy.flatnonzero((lst > 1) & (lst < 3))[:10]

def list_comprehension():
    [e for e in lst if 1 < e < 3][:10]

print timeit.timeit(numpy_way)
print timeit.timeit(list_comprehension)

The numpy version is over 60 times faster.