multiprocessing.Pool seems to work on Windows but not on Ubuntu?

SOLVED: The problem was Wingware Python IDE. I guess the natural question now is how it is possible and how this could be fixed.

I asked a question yesterday (Problem with multiprocessing.Pool in Python), and this question is almost the same, but I have since found that the code seems to work on a Windows computer and not on my Ubuntu machine. At the end of this post I will post a slightly different version of the code that does the same thing.

Short summary of my problem: When using multiprocessing.Pool in Python I am not always able to get the number of workers that I ask for. When this happens, the program just stalls.

I have been working on a solution all day, and then I came to think about Noahs' comment on my previous question. He said that it worked on his machine, so I gave the code to my colleague, who runs a Windows machine with Enthought's 64-bit Python 2.7.1 distribution. I have the same distribution, with the big difference that mine runs on Ubuntu. I should also mention that we both use Wingware Python IDE, but I doubt that this is of any importance?

There are two problems with my code that don't arise when my colleague runs the code on his machine.

  1. I am not always able to get the four workers I am asking for (although my machine has 12 hardware threads). When this happens, the process just stalls and does not continue. No exception or error is raised.

  2. When I am able to get the four workers I ask for (which happens approximately 1 out of 5 times), the figures that are produced (plain random numbers) are EXACTLY the same for all four pictures. This is not the case for my colleague.

Something is very fishy and I am very thankful for any kind of help you guys can offer.

The code:

import multiprocessing as mp
import scipy as sp
import scipy.stats as spstat
import pylab

def testfunc(x0, N):
    print 'working with x0 = %s' % x0
    x = [x0]
    for i in xrange(1,N):
        x.append(spstat.norm.rvs(size = 1)) # stupid appending to make it slower
        if i % 10000 == 0:
            print 'x0 = %s, i = %s' % (x0, i)
    return sp.array(x)

def testfuncParallel(fargs):
    return testfunc(*fargs)


# Define Number of tasks.
nTasks = 4
N = 100000

if __name__ == '__main__':

    """
    Try number 1. Using multiprocessing.Pool together with Pool.map_async
    """
    pool = mp.Pool(processes = nTasks) # I have 12 threads (six cores) available, so I am surprised that it does not get nTasks = 4 workers

    # Define tasks:
    tasks = [(x, n) for x, n in enumerate(nTasks*[N])] # nTasks different tasks

    # Compute in parallel: map_async does not block; result.get() still returns the results in task order.
    result = pool.map_async(testfuncParallel, tasks)

    pool.close() # These are needed if map_async is used
    pool.join()

    # Get results:
    sim = sp.zeros((N, nTasks)) 

    for nn, res in enumerate(result.get()):    
        sim[:, nn] = res

    pylab.figure()
    for i in xrange(nTasks):
        pylab.subplot(nTasks,1, i + 1)
        pylab.plot(sim[:, i])

    pylab.show()

Thanks in advance.

Sincerely, Matias


Solution 1:

I don't have a solution for your first problem. In fact, I can run your code repeatedly without fail on my 64-bit Ubuntu box with Enthought's Python 2.7.1 [EPD 7.0-2 (64-bit)]. Edit: It turns out the problem was being caused by your IDE (Wingware). The obvious workaround is to run the script from outside the IDE.
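Independently of the IDE issue, a stall like the one described in the question can at least be made visible by putting a timeout on the result instead of waiting indefinitely. The following is a minimal sketch, not a fix: it assumes Python 3 (the question's code targets Python 2.7), and the names run_with_timeout and square are made up for illustration. It relies on AsyncResult.get(timeout=...) raising multiprocessing.TimeoutError when the results are not ready in time.

```python
import multiprocessing as mp


def square(x):
    """Trivial stand-in for the real worker function (illustrative only)."""
    return x * x


def run_with_timeout(values, processes=2, timeout=30.0):
    """Map square over values, raising instead of hanging if workers stall."""
    pool = mp.Pool(processes=processes)
    try:
        async_result = pool.map_async(square, values)
        # Raises multiprocessing.TimeoutError if no result arrives in time.
        results = async_result.get(timeout=timeout)
    except BaseException:
        # On timeout or any other failure, stop the workers immediately.
        pool.terminate()
        raise
    else:
        # Normal path: no more tasks will be submitted.
        pool.close()
    finally:
        # Safe here because close() or terminate() has already been called.
        pool.join()
    return results


if __name__ == "__main__":
    print(run_with_timeout([1, 2, 3, 4]))
```

With this structure a stalled pool surfaces as an exception after the timeout, which is easier to diagnose than a silent hang in pool.join().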

As to the second question: on Unix the worker processes are created with fork(), so every worker inherits the same state of the random number generator from the parent process. This is why they generate identical pseudo-random sequences. To fix this, call scipy.random.seed (the same function as numpy.random.seed) at the top of testfunc, so that each worker reseeds its own generator:

def testfunc(x0, N):
    sp.random.seed()
    print 'working with x0 = %s' % x0
    ...
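The same idea can be sketched without relying on global generator state at all: each task builds its own private generator, seeded from operating-system entropy, so nothing inherited from the parent matters. This is only an illustration under stated assumptions: it is written for Python 3, it swaps in the standard library's random module in place of scipy to stay self-contained, and draw_samples is a made-up name. With SciPy/NumPy the reseeding call shown above plays the same role.

```python
import os
import random
from multiprocessing import Pool


def draw_samples(task):
    """Return (index, samples) using a generator private to this call."""
    index, count = task
    # Seed from OS entropy so forked workers never share a sequence.
    rng = random.Random(os.urandom(16))
    return index, [rng.gauss(0.0, 1.0) for _ in range(count)]


if __name__ == "__main__":
    tasks = [(i, 3) for i in range(4)]
    pool = Pool(processes=2)
    try:
        results = pool.map(draw_samples, tasks)
    finally:
        pool.close()
        pool.join()
    for index, samples in results:
        print(index, samples)
```

Each printed row should show a different set of numbers, which is the behavior the original code was missing on Ubuntu.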

Solution 2:

Update: It turns out this had nothing to do with matplotlib or the backends, but rather with a bug associated with multiprocessing in general. We've fixed this in Wing version 4.0.4 and later. The workaround for earlier versions is not to set breakpoints in the code that is executed in the sub-processes.

It seems to be Wing IDE's matplotlib support for the Tkinter backend interacting badly with multiprocessing. When I try this example it crashes in Tcl/Tk code. I suspect the person working on Windows was using a different matplotlib backend.

Turning off the "matplotlib event loop support" in Project Properties under the Extensions tab seems to work around it.

Or, adding the following seems to fix it for me when the "matplotlib event loop support" is turned on.

import matplotlib
matplotlib.use('WXAgg')

This will only work if you have the WXAgg backend. Other backends supported by Wing IDE (in such a way that plots remain interactive even if the debug process is paused) are GTKAgg and Qt4Agg, but I haven't tried those yet.

I'll see if I can find and fix the bug. I suspect we need to disable our event loop support when the process ID changes. Thanks for reporting this.
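As a rough illustration of what "disable when the process ID changes" could look like, here is a hypothetical sketch. It is not Wing IDE's actual code; the class name ForkAwareHook and its attributes are invented for this example. The only real API it uses is os.getpid(), which returns a different value in a forked child than in the parent.

```python
import os


class ForkAwareHook(object):
    """Hypothetical helper that stays active only in its creating process."""

    def __init__(self):
        # Remember which process created the hook.
        self._owner_pid = os.getpid()
        self.enabled = True

    def check(self):
        """Disable the hook if we are now running in a different process."""
        if os.getpid() != self._owner_pid:
            self.enabled = False
        return self.enabled


if __name__ == "__main__":
    hook = ForkAwareHook()
    print(hook.check())  # still in the creating process
```

A forked worker would inherit the hook object with the parent's PID stored in it, so the first call to check() in the child would switch the hook off.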