Length of generator output [duplicate]
Solution 1:
The easiest way is probably just sum(1 for _ in gen), where gen is your generator.
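A short sketch of that idiom, using a made-up generator named gen. One thing worth seeing in practice: counting consumes the generator, so a second count finds nothing left.

```python
# Hypothetical generator, used only for illustration.
gen = (n * n for n in range(10))

# Each item contributes 1 to the sum; the items themselves are discarded.
first_count = sum(1 for _ in gen)

# The generator is now exhausted, so counting again yields zero.
second_count = sum(1 for _ in gen)

print(first_count, second_count)  # 10 0
```

If you still need the items after counting, this approach alone won't do, because nothing is kept.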
Solution 2:
For those who want a summary of that discussion, here are the final results for counting a generator expression of 50 million items using:
- len(list(gen))
- len([_ for _ in gen])
- sum(1 for _ in gen)
- ilen(gen) (from more_itertools)
- reduce(lambda c, i: c + 1, gen, 0)
Sorted by execution time (the reported memory consumption is shown as well), the results may surprise you:
#1: test_list.py:8: 0.492 KiB
gen = (i for i in data*1000); t0 = monotonic(); len(list(gen))
('list, sec', 1.9684218849870376)
#2: test_list_compr.py:8: 0.867 KiB
gen = (i for i in data*1000); t0 = monotonic(); len([i for i in gen])
('list_compr, sec', 2.5885991149989422)
#3: test_sum.py:8: 0.859 KiB
gen = (i for i in data*1000); t0 = monotonic(); sum(1 for i in gen); t1 = monotonic()
('sum, sec', 3.441088170016883)
#4: more_itertools/more.py:413: 1.266 KiB
d = deque(enumerate(iterable, 1), maxlen=1)
test_ilen.py:10: 0.875 KiB
gen = (i for i in data*1000); t0 = monotonic(); ilen(gen)
('ilen, sec', 9.812256851990242)
#5: test_reduce.py:8: 0.859 KiB
gen = (i for i in data*1000); t0 = monotonic(); reduce(lambda counter, i: counter + 1, gen, 0)
('reduce, sec', 13.436614598002052)
So, in these measurements, len(list(gen)) was the fastest and also reported the lowest memory figure.
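If you want to try a comparison like this yourself, here is a rough sketch of a timing harness. It is not the original test scripts: the input size N, the helper make_gen and the dictionary layout are my own assumptions, it measures time only (no memory), and ilen is left out so that nothing beyond the standard library is needed.

```python
from functools import reduce
from time import monotonic

# Assumed size, far smaller than the 50 million items in the runs above.
N = 200_000

def make_gen():
    """Return a fresh generator each time, since counting exhausts it."""
    return (i for i in range(N))

# The candidates that need only the standard library.
candidates = {
    "list": lambda gen: len(list(gen)),
    "list_compr": lambda gen: len([i for i in gen]),
    "sum": lambda gen: sum(1 for _ in gen),
    "reduce": lambda gen: reduce(lambda counter, _: counter + 1, gen, 0),
}

results = {}
for name, func in candidates.items():
    gen = make_gen()
    t0 = monotonic()
    count = func(gen)
    elapsed = monotonic() - t0
    results[name] = (count, elapsed)
    print(f"{name}: count={count}, sec={elapsed:.4f}")
```

Timings will vary with the machine and Python version, so treat the ordering you get as a data point rather than a rule.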
Solution 3:
There isn't one because you can't do it in the general case - what if you have a lazy infinite generator? For example:
def fib():
    a, b = 0, 1
    while True:
        a, b = b, a + b
        yield a
This never terminates but will generate the Fibonacci numbers. You can get as many Fibonacci numbers as you want by calling next().
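To make that concrete, here is a small sketch that repeats the generator and pulls a bounded number of values from it. The use of itertools.islice is my addition; it is one way to take a finite slice of an infinite generator without ever asking for its length.

```python
from itertools import islice

def fib():
    a, b = 0, 1
    while True:
        a, b = b, a + b
        yield a

# Take only the first ten values; the generator itself never ends.
first_ten = list(islice(fib(), 10))
print(first_ten)  # [1, 1, 2, 3, 5, 8, 13, 21, 34, 55]

# Or pull values one at a time with next().
gen = fib()
print(next(gen), next(gen), next(gen))  # 1 1 2
```

Calling sum(1 for _ in fib()) or len(list(fib())) here would simply never return.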
If you really need to know the number of items, then a single linear pass over them isn't enough for you anyway, so just use a different data structure such as a regular list.
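A minimal sketch of that suggestion, assuming the generator is finite and small enough to fit in memory. The generator squares is hypothetical and only there for illustration.

```python
def squares(limit):
    """Hypothetical finite generator, used only for illustration."""
    for n in range(limit):
        yield n * n

# Materialize once; the list can then be measured and iterated repeatedly.
items = list(squares(5))

print(len(items))  # 5
print(items)       # [0, 1, 4, 9, 16]
print(sum(items))  # 30
```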
Solution 4:
def count(iterable):
    return sum(1 for _ in iterable)
Or better yet:
def count(iterable):
    try:
        return len(iterable)
    except TypeError:
        return sum(1 for _ in iterable)
If the argument is not iterable at all, the fallback will throw a TypeError.
Or, if you want to count something specific in the generator:
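def count(iterable, key=None):
    if key is not None:
        if callable(key):
            # key is a predicate: count the items it accepts
            return sum(bool(key(x)) for x in iterable)
        # key is a plain value: count the items equal to it
        return sum(x == key for x in iterable)
    try:
        return len(iterable)
    except TypeError:
        return sum(1 for _ in iterable)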
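Here is how that last version might be used. The function is repeated so the snippet runs on its own; the parameter name iterable and the `key is not None` check are choices made for this sketch (the check lets falsy values such as 0 be counted), and the sample inputs are made up.

```python
def count(iterable, key=None):
    if key is not None:
        if callable(key):
            # key is a predicate: count the items it accepts
            return sum(bool(key(x)) for x in iterable)
        # key is a plain value: count the items equal to it
        return sum(x == key for x in iterable)
    try:
        return len(iterable)
    except TypeError:
        return sum(1 for _ in iterable)

print(count([1, 2, 3]))                                   # 3, via len()
print(count(n for n in range(7)))                         # 7, via the sum() fallback
print(count(range(10), key=lambda n: n % 2 == 0))         # 5 even numbers
print(count([0, 1, 0, 2, 0], key=0))                      # 3 items equal to 0
```

Keep in mind that a generator passed in is consumed by the call, whichever branch is taken.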