How to parallelize list-comprehension calculations in Python?
Both list comprehensions and map-calculations should -- at least in theory -- be relatively easy to parallelize: each calculation inside a list-comprehension could be done independent of the calculation of all the other elements. For example in the expression
[ x*x for x in range(1000) ]
each x*x-Calculation could (at least in theory) be done in parallel.
My question is: Is there any Python-Module / Python-Implementation / Python Programming-Trick to parallelize a list-comprehension calculation (in order to use all 16 / 32 / ... cores or distribute the calculation over a Computer-Grid or over a Cloud)?
Solution 1:
As Ken said, it can't, but with 2.6's multiprocessing module, it's pretty easy to parallelize computations.
import multiprocessing
try:
cpus = multiprocessing.cpu_count()
except NotImplementedError:
cpus = 2 # arbitrary default
def square(n):
return n * n
pool = multiprocessing.Pool(processes=cpus)
print(pool.map(square, range(1000)))
There are also examples in the documentation that show how to do this using Managers, which should allow for distributed computations as well.
Solution 2:
For shared-memory parallelism, I recommend joblib:
from joblib import delayed, Parallel
def square(x): return x*x
values = Parallel(n_jobs=NUM_CPUS)(delayed(square)(x) for x in range(1000))