Get the number of items in a list (or other iterable) that satisfy a certain condition

Assuming that I have a list with a huge number of items,

l = [ 1, 4, 6, 30, 2, ... ]

I want to get the number of items from that list, where an item satisfies a certain condition. My first thought was:

count = len([i for i in l if my_condition(i)])

But if the filtered list also has a great number of items, I think that creating a new list for the filtered result is just a waste of memory. For efficiency, IMHO, the above call can't be better than:

count = 0
for i in l:
    if my_condition(i):
        count += 1

Is there any functional-style way to get the number of items that satisfy the condition without generating a temporary list?


Solution 1:

You can use a generator expression:

>>> l = [1, 3, 7, 2, 6, 8, 10]
>>> sum(1 for i in l if i % 4 == 3)
2

or even

>>> sum(i % 4 == 3 for i in l)
2

which uses the fact that True == 1 and False == 0.
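To make that last point concrete, here is a minimal sketch (the name flags is just for illustration) showing that bool is a subclass of int, so summing comparison results counts the True values:

```python
# bool is a subclass of int, so True and False behave as 1 and 0 in arithmetic.
print(issubclass(bool, int))   # True
print(True + True)             # 2

l = [1, 3, 7, 2, 6, 8, 10]
# A list is built here only to make the intermediate values visible;
# in real code the generator expression avoids it.
flags = [i % 4 == 3 for i in l]
total = sum(flags)
print(flags)   # [False, True, True, False, False, False, False]
print(total)   # 2
```

Note that this form relies on the condition returning an actual bool (or 0/1). If it can return other truthy values, the sum(1 for ... if ...) form is the safer choice.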

Alternatively, you could use itertools.imap (Python 2) or simply map (Python 3), both of which are lazy and do not build an intermediate list:

>>> def my_condition(x):
...     return x % 4 == 3
... 
>>> sum(map(my_condition, l))
2

Solution 2:

You want a generator expression rather than a list comprehension here.

For example,

l = [1, 4, 6, 7, 30, 2]

def my_condition(x):
    return x > 5 and x < 20

print(sum(1 for x in l if my_condition(x)))
# -> 2
print(sum(1 for x in range(1000000) if my_condition(x)))
# -> 14

Or use itertools.imap (plain map in Python 3), though I think the explicit list comprehensions and generator expressions look somewhat more Pythonic.

Note that, though it's not obvious from the sum example, you can compose generator expressions nicely. For example,

inputs = range(1000000)       # In Python 2, use xrange to avoid building a list
odds = (x for x in inputs if x % 2)  # Pick odd numbers
sq_inc = (x**2 + 1 for x in odds)    # Square and add one
print(sum(x // 2 for x in sq_inc))   # Actually evaluate each one (// keeps integer division in Python 3)
# -> 83333333333500000

The cool thing about this technique is that you can specify conceptually separate steps in code without forcing evaluation and storage in memory until the final result is evaluated.
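A small sketch of that laziness, assuming a hypothetical noisy_condition that records every value it is called with. Building the pipeline runs no condition checks; they only happen once sum consumes it:

```python
calls = []  # records every value the condition is actually evaluated on

def noisy_condition(x):
    # Hypothetical condition that logs its argument so evaluation is observable.
    calls.append(x)
    return x % 2 == 1

inputs = range(10)
odds = (x for x in inputs if noisy_condition(x))   # condition not called yet
squares = (x ** 2 for x in odds)                   # still not called

calls_before = len(calls)   # building the pipeline ran no condition checks
result = sum(squares)       # consuming the pipeline triggers the work
calls_after = len(calls)    # one call per input element

print(calls_before, result, calls_after)   # 0 165 10
```

Keep in mind that a generator can only be consumed once. After the sum, squares is exhausted and would have to be rebuilt to iterate again.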

Solution 3:

This can also be done using reduce if you prefer functional programming (in Python 3, reduce has to be imported from functools):

from functools import reduce

reduce(lambda count, i: count + my_condition(i), l, 0)

This way you only do 1 pass and no intermediate list is generated.
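As a rough illustration of the "no intermediate list" point, the sketch below compares the size of a filtered list with the size of the equivalent generator object, and checks that all three counting approaches agree. The condition and the input size are assumptions chosen for the example. sys.getsizeof only measures the container itself (not the items it refers to), and exact byte counts vary by platform and Python version.

```python
import sys
from functools import reduce

def my_condition(x):
    # Example condition, assumed for illustration only
    return x % 4 == 3

l = list(range(100000))

filtered_list = [i for i in l if my_condition(i)]   # materializes every match
lazy_filter = (i for i in l if my_condition(i))     # small object, independent of len(l)

list_bytes = sys.getsizeof(filtered_list)   # grows with the number of matches
gen_bytes = sys.getsizeof(lazy_filter)      # stays small regardless of input size

count_via_len = len(filtered_list)
count_via_sum = sum(1 for _ in lazy_filter)
count_via_reduce = reduce(lambda count, i: count + my_condition(i), l, 0)

print(list_bytes > gen_bytes)                           # True
print(count_via_len, count_via_sum, count_via_reduce)   # 25000 25000 25000
```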

Solution 4:

You could do something like:

l = [1,2,3,4,5,..]
count = sum(1 for i in l if my_condition(i))

which just adds 1 for each element that satisfies the condition.
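If you need this in several places, the same idea can be wrapped in a small helper. The name count_if and its signature are my own invention for this sketch, not something from the standard library. Because it only iterates, it works on any iterable, including one-shot generators:

```python
def count_if(iterable, predicate):
    """Count the items of iterable for which predicate(item) is truthy."""
    # The generator expression yields a 1 per matching item; no list is built.
    return sum(1 for item in iterable if predicate(item))

words = ["apple", "kiwi", "banana", "fig"]
long_words = count_if(words, lambda w: len(w) > 4)

# Works on a generator too: squares of 0..9, counting the even ones
even_squares = count_if((n * n for n in range(10)), lambda n: n % 2 == 0)

print(long_words)     # 2
print(even_squares)   # 5
```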