Multiprocessing Bomb
I was working through the following example from Doug Hellmann's tutorial on multiprocessing:
import multiprocessing

def worker():
    """worker function"""
    print 'Worker'
    return

if __name__ == '__main__':
    jobs = []
    for i in range(5):
        p = multiprocessing.Process(target=worker)
        jobs.append(p)
        p.start()
When I tried to run it outside the if statement:
import multiprocessing

def worker():
    """worker function"""
    print 'Worker'
    return

jobs = []
for i in range(5):
    p = multiprocessing.Process(target=worker)
    jobs.append(p)
    p.start()
It started spawning processes non-stop, and the only way to stop it was to reboot!
Why would that happen? Why did it not generate 5 processes and exit? Why do I need the if statement?
Solution 1:
On Windows there is no fork() routine, so multiprocessing imports the current module in each child process to get access to the worker function. Importing a module runs all of its top-level code, so without the if statement the child process starts its own children, and so on.
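The re-import can be simulated without starting any processes. The following is only a sketch for Python 3: SCRIPT, events, and load_as are hypothetical names made up for the illustration, and "__mp_main__" is assumed to be the name under which a spawned child loads the main file (older versions may use a different name). It runs the same file twice, once as the script and once under the child's name, and records which parts executed.

```python
import os
import runpy
import tempfile

# Hypothetical stand-in for the user's script. It only records what ran,
# so nothing is actually spawned here.
SCRIPT = '''
events = []
events.append("top-level code ran")      # runs on every import
if __name__ == "__main__":
    events.append("guarded code ran")    # runs only in the original script
'''

def load_as(run_name):
    """Execute SCRIPT as a module called run_name and return its events list."""
    fd, path = tempfile.mkstemp(suffix=".py")
    try:
        with os.fdopen(fd, "w") as handle:
            handle.write(SCRIPT)
        # run_path executes the file with __name__ set to run_name
        return runpy.run_path(path, run_name=run_name)["events"]
    finally:
        os.remove(path)

as_script = load_as("__main__")      # how the parent runs the file
as_child = load_as("__mp_main__")    # assumed name used for the re-import in a child

print(as_script)  # ['top-level code ran', 'guarded code ran']
print(as_child)   # ['top-level code ran']
```

Anything at the top level runs again in every child. If that top-level code is the loop that calls p.start(), every child starts children of its own.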
Solution 2:
Note that the documentation mentions that you need the if statement on Windows.
However, the documentation doesn't say that leaving it out can kill your machine almost instantly, requiring a reboot. So this can be quite confusing, especially if the use of multiprocessing happens in some function deep inside the code. No matter how deeply hidden it is, you still need the if check in the main program file. This makes multiprocessing awkward to use in any kind of library, because the library cannot add that check on behalf of the script that calls it.
multiprocessing in general seems a bit rough. Its interface mirrors that of the threading module, but there is just no simple way around the GIL.
For more complex parallelization problems I would also look at the subprocess module or some other libraries (like mpi4py or Parallel Python).
Solution 3:
I don't know about multiprocessing, but I suspect that it spawns child processes that have a different __name__ global. By removing the test, you are making every child start the spawning process again.
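One way to check that suspicion is to have a script print __name__ at import time and then start a single child. This is a rough sketch for Python 3, not a definitive test: SCRIPT and names_seen are hypothetical names, the "spawn" start method is requested explicitly to imitate Windows on any platform, and it assumes the child's output reaches the same pipe as the parent's, which may not hold everywhere. The script is written to a temporary file and run in a fresh interpreter, so the guard inside it is what keeps the experiment from recursing.

```python
import os
import subprocess
import sys
import tempfile

# Hypothetical throwaway script: reports the name it was loaded under,
# then starts one child using the Windows-style "spawn" start method.
SCRIPT = '''
import multiprocessing

print("module loaded as " + __name__)

def worker():
    pass

if __name__ == "__main__":
    ctx = multiprocessing.get_context("spawn")
    p = ctx.Process(target=worker)
    p.start()
    p.join()
'''

def names_seen():
    """Run SCRIPT in a fresh interpreter and return the sorted names it printed."""
    fd, path = tempfile.mkstemp(suffix=".py")
    try:
        with os.fdopen(fd, "w") as handle:
            handle.write(SCRIPT)
        result = subprocess.run(
            [sys.executable, path],
            capture_output=True, text=True, check=True,
        )
        prefix = "module loaded as "
        return sorted(line[len(prefix):]
                      for line in result.stdout.splitlines()
                      if line.startswith(prefix))
    finally:
        os.remove(path)

seen = names_seen()
print(seen)
```

If the list contains a second name besides __main__, the child did load the file under a different __name__, which is why the guarded block is skipped there while unguarded top-level code is not.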