Are list-comprehensions and functional functions faster than "for loops"?
The following are rough guidelines and educated guesses based on experience. You should timeit
or profile your concrete use case to get hard numbers, and those numbers may occasionally disagree with the below.
A list comprehension is usually a tiny bit faster than the precisely equivalent for
loop (that actually builds a list), most likely because it doesn't have to look up the list and its append
method on every iteration. However, a list comprehension still does a bytecode-level loop:
>>> dis.dis(<the code object for `[x for x in range(10)]`>)
1 0 BUILD_LIST 0
3 LOAD_FAST 0 (.0)
>> 6 FOR_ITER 12 (to 21)
9 STORE_FAST 1 (x)
12 LOAD_FAST 1 (x)
15 LIST_APPEND 2
18 JUMP_ABSOLUTE 6
>> 21 RETURN_VALUE
Using a list comprehension in place of a loop that doesn't build a list, nonsensically accumulating a list of meaningless values and then throwing the list away, is often slower because of the overhead of creating and extending the list. List comprehensions aren't magic that is inherently faster than a good old loop.
As for functional list processing functions: While these are written in C and probably outperform equivalent functions written in Python, they are not necessarily the fastest option. Some speed up is expected if the function is written in C too. But most cases using a lambda
(or other Python function), the overhead of repeatedly setting up Python stack frames etc. eats up any savings. Simply doing the same work in-line, without function calls (e.g. a list comprehension instead of map
or filter
) is often slightly faster.
Suppose that in a game that I'm developing I need to draw complex and huge maps using for loops. This question would be definitely relevant, for if a list-comprehension, for example, is indeed faster, it would be a much better option in order to avoid lags (Despite the visual complexity of the code).
Chances are, if code like this isn't already fast enough when written in good non-"optimized" Python, no amount of Python level micro optimization is going to make it fast enough and you should start thinking about dropping to C. While extensive micro optimizations can often speed up Python code considerably, there is a low (in absolute terms) limit to this. Moreover, even before you hit that ceiling, it becomes simply more cost efficient (15% speedup vs. 300% speed up with the same effort) to bite the bullet and write some C.
If you check the info on python.org, you can see this summary:
Version Time (seconds)
Basic loop 3.47
Eliminate dots 2.45
Local variable & no dots 1.79
Using map function 0.54
But you really should read the above article in details to understand the cause of the performance difference.
I also strongly suggest you should time your code by using timeit. At the end of the day, there can be a situation where, for example, you may need to break out of for
loop when a condition is met. It could potentially be faster than finding out the result by calling map
.
You ask specifically about map()
, filter()
and reduce()
, but I assume you want to know about functional programming in general. Having tested this myself on the problem of computing distances between all points within a set of points, functional programming (using the starmap
function from the built-in itertools
module) turned out to be slightly slower than for-loops (taking 1.25 times as long, in fact). Here is the sample code I used:
import itertools, time, math, random
class Point:
def __init__(self,x,y):
self.x, self.y = x, y
point_set = (Point(0, 0), Point(0, 1), Point(0, 2), Point(0, 3))
n_points = 100
pick_val = lambda : 10 * random.random() - 5
large_set = [Point(pick_val(), pick_val()) for _ in range(n_points)]
# the distance function
f_dist = lambda x0, x1, y0, y1: math.sqrt((x0 - x1) ** 2 + (y0 - y1) ** 2)
# go through each point, get its distance from all remaining points
f_pos = lambda p1, p2: (p1.x, p2.x, p1.y, p2.y)
extract_dists = lambda x: itertools.starmap(f_dist,
itertools.starmap(f_pos,
itertools.combinations(x, 2)))
print('Distances:', list(extract_dists(point_set)))
t0_f = time.time()
list(extract_dists(large_set))
dt_f = time.time() - t0_f
Is the functional version faster than the procedural version?
def extract_dists_procedural(pts):
n_pts = len(pts)
l = []
for k_p1 in range(n_pts - 1):
for k_p2 in range(k_p1, n_pts):
l.append((pts[k_p1].x - pts[k_p2].x) ** 2 +
(pts[k_p1].y - pts[k_p2].y) ** 2)
return l
t0_p = time.time()
list(extract_dists_procedural(large_set))
# using list() on the assumption that
# it eats up as much time as in the functional version
dt_p = time.time() - t0_p
f_vs_p = dt_p / dt_f
if f_vs_p >= 1.0:
print('Time benefit of functional progamming:', f_vs_p,
'times as fast for', n_points, 'points')
else:
print('Time penalty of functional programming:', 1 / f_vs_p,
'times as slow for', n_points, 'points')
I wrote a simple script that test the speed and this is what I found out. Actually for loop was fastest in my case. That really suprised me, check out bellow (was calculating sum of squares).
from functools import reduce
import datetime
def time_it(func, numbers, *args):
start_t = datetime.datetime.now()
for i in range(numbers):
func(args[0])
print (datetime.datetime.now()-start_t)
def square_sum1(numbers):
return reduce(lambda sum, next: sum+next**2, numbers, 0)
def square_sum2(numbers):
a = 0
for i in numbers:
i = i**2
a += i
return a
def square_sum3(numbers):
sqrt = lambda x: x**2
return sum(map(sqrt, numbers))
def square_sum4(numbers):
return(sum([int(i)**2 for i in numbers]))
time_it(square_sum1, 100000, [1, 2, 5, 3, 1, 2, 5, 3])
time_it(square_sum2, 100000, [1, 2, 5, 3, 1, 2, 5, 3])
time_it(square_sum3, 100000, [1, 2, 5, 3, 1, 2, 5, 3])
time_it(square_sum4, 100000, [1, 2, 5, 3, 1, 2, 5, 3])
0:00:00.302000 #Reduce
0:00:00.144000 #For loop
0:00:00.318000 #Map
0:00:00.390000 #List comprehension