Multiprocessing Bomb

I was working through the following example from Doug Hellmann's tutorial on multiprocessing:

import multiprocessing

def worker():
    """worker function"""
    print('Worker')
    return

if __name__ == '__main__':
    jobs = []
    for i in range(5):
        p = multiprocessing.Process(target=worker)
        jobs.append(p)
        p.start()

When I tried to run the same code without the if statement:

import multiprocessing

def worker():
    """worker function"""
    print('Worker')
    return

jobs = []
for i in range(5):
    p = multiprocessing.Process(target=worker)
    jobs.append(p)
    p.start()

It started spawning processes non-stop, and the only way to stop it was to reboot!

Why would that happen? Why did it not generate 5 processes and exit? Why do I need the if statement?


Solution 1:

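On Windows there is no fork() routine, so multiprocessing starts a fresh Python interpreter for each child and imports the main module there to get access to the worker function. Importing a module executes its top-level code, so without the if statement every child process starts its own five children, each of those starts five more, and so on.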
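
The re-import can be imitated safely, without starting any processes. This is only a sketch in Python 3, under some assumptions: the names UNGUARDED, GUARDED, launched and run_as are made up for the illustration, appending to a list stands in for Process(...).start(), and `__mp_main__` is the module name that recent CPython releases use when a spawned child re-runs the main script (older versions may use a different name).

```python
import os
import runpy
import tempfile
import textwrap

# Two tiny stand-in scripts. Appending to `launched` plays the role of
# calling Process(...).start() in the real program.
UNGUARDED = textwrap.dedent('''
    launched = []
    launched.append('child process')
''')

GUARDED = textwrap.dedent('''
    launched = []
    if __name__ == '__main__':
        launched.append('child process')
''')


def run_as(source, run_name):
    """Execute `source` as a script whose __name__ is `run_name`."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, 'demo_script.py')
        with open(path, 'w') as f:
            f.write(source)
        # run_path returns the globals of the executed script.
        return runpy.run_path(path, run_name=run_name)['launched']


for label, source in (('unguarded', UNGUARDED), ('guarded', GUARDED)):
    first_run = run_as(source, '__main__')    # like `python script.py`
    reimport = run_as(source, '__mp_main__')  # like the re-import in a child
    print(label, len(first_run), len(reimport))
```

The unguarded script launches something on the re-import as well, which in the real program means every child starts children of its own. The guarded script launches only on the first run.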

Solution 2:

Note that the documentation mentions that you need the if statement on Windows (see "Safe importing of main module" in the multiprocessing programming guidelines).

However, the documentation doesn't say that omitting it can bring your machine down almost instantly and force a reboot. This can be quite confusing, especially if multiprocessing is used in some function deep inside the code. No matter how deeply hidden the call is, you still need the if check in the main program file. This makes multiprocessing awkward to use inside a library, because the library cannot enforce the check in its caller's script.

multiprocessing in general seems a bit rough. Its interface might mirror that of the threading module, but there is just no simple way around the GIL.

For more complex parallelization problems I would also look at the subprocess module or some other libraries (like mpi4py or Parallel Python).

Solution 3:

I don't know about multiprocessing, but I suspect that it spawns child processes that have a different __name__ global. By removing the test, you are making every child start the spawning process again.
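
That suspicion can be checked directly. The following is a rough Python 3 sketch, not a definitive test: it writes a small guarded script to a temporary file, runs it in a fresh interpreter with the spawn start method (the process model Windows uses), and collects the `__name__` seen by the parent and by the child. The names SCRIPT, report and observed_names are invented for this example, and the exact name the child reports depends on the Python version.

```python
import os
import subprocess
import sys
import tempfile
import textwrap

SCRIPT = textwrap.dedent('''
    import multiprocessing

    def report():
        # Runs in the child; __name__ is the module global the child sees.
        print('child:', __name__, flush=True)

    if __name__ == '__main__':
        multiprocessing.set_start_method('spawn')  # same model as Windows
        print('parent:', __name__, flush=True)
        p = multiprocessing.Process(target=report)
        p.start()
        p.join()
''')


def observed_names():
    """Run SCRIPT in a fresh interpreter; return {'parent': ..., 'child': ...}."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, 'name_demo.py')
        with open(path, 'w') as f:
            f.write(SCRIPT)
        out = subprocess.run([sys.executable, path], capture_output=True,
                             text=True, timeout=120).stdout
    return dict(line.split(': ') for line in out.splitlines())


names = observed_names()
print(names)
```

The parent reports `__main__`, while the child reports some other name, so the guarded block is skipped in the child.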