How to parallelize list-comprehension calculations in Python?

Both list comprehensions and map calculations should, at least in theory, be relatively easy to parallelize: each calculation inside a list comprehension can be done independently of the calculations for all the other elements. For example, in the expression

[ x*x for x in range(1000) ]

each x*x calculation could (at least in theory) be done in parallel.

My question is: is there any Python module, Python implementation, or programming trick to parallelize a list-comprehension calculation, so as to use all 16 / 32 / ... cores, or to distribute the calculation over a computer grid or a cloud?

Solution 1:

As Ken said, a list comprehension can't be parallelized as-is, but the multiprocessing module (in the standard library since Python 2.6) makes it pretty easy to parallelize such computations.

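import multiprocessing


def square(n):
    return n * n


if __name__ == "__main__":
    # The guard keeps worker processes from re-running this block on
    # platforms that start workers by re-importing the main module.
    try:
        cpus = multiprocessing.cpu_count()
    except NotImplementedError:
        cpus = 2   # arbitrary default

    pool = multiprocessing.Pool(processes=cpus)
    try:
        print(pool.map(square, range(1000)))
    finally:
        # Release the worker processes once the results are in.
        pool.close()
        pool.join()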

There are also examples in the documentation that show how to do this using Managers, which should allow for distributed computations as well.
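
A comprehension that takes more than one input per element, such as [x * y for x, y in pairs], can be mapped onto the same kind of pool with Pool.starmap. The following is only a minimal sketch, assuming Python 3.3 or later; the helper name parallel_products and the choice of operator.mul are illustrative and not part of the answer above.

```python
import multiprocessing
import operator


def parallel_products(pairs, processes=None):
    """Parallel counterpart of [x * y for x, y in pairs] (hypothetical helper).

    operator.mul is used because the function is sent to the workers
    by reference, so it must be importable by name; a lambda would
    fail to pickle.
    """
    # processes=None lets Pool pick its default number of workers.
    with multiprocessing.Pool(processes=processes) as pool:
        # starmap unpacks each tuple into positional arguments and
        # returns results in input order, like the comprehension does.
        return pool.starmap(operator.mul, pairs)


if __name__ == "__main__":
    pairs = [(x, x + 1) for x in range(10)]
    print(parallel_products(pairs))
```

For work as cheap as a single multiplication, the cost of shipping arguments between processes will likely outweigh any gain; the pattern pays off when each call does a substantial amount of computation.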

Solution 2:

For shared-memory parallelism, I recommend joblib:

from joblib import delayed, Parallel

def square(x):
    return x * x

# n_jobs=-1 asks joblib to use all available CPU cores.
values = Parallel(n_jobs=-1)(delayed(square)(x) for x in range(1000))
