Multiprocessing example giving AttributeError
I am trying to use multiprocessing in my code, so I thought I would start by working through some examples. I used the first example found in this documentation.
from multiprocessing import Pool

def f(x):
    return x*x

if __name__ == '__main__':
    with Pool(5) as p:
        print(p.map(f, [1, 2, 3]))
When I run the above code I get an AttributeError: can't get attribute 'f' on <module '__main__' (built-in)>. I do not know why I am getting this error. I am also using Python 3.5 if that helps.
This problem seems to be a design feature of multiprocessing.Pool. See https://bugs.python.org/issue25053. For some reason, Pool does not always work with objects that are not defined in an imported module, so you have to put your function in a separate file and import that file as a module.
File: defs.py

def f(x):
    return x*x

File: run.py

from multiprocessing import Pool
import defs

if __name__ == '__main__':
    with Pool(5) as p:
        print(p.map(defs.f, [1, 2, 3]))
If you use print or a different built-in function, the example should work. If this is not a bug (as the linked issue suggests), the example given in the documentation is badly chosen.
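As a rough illustration of why a built-in function or an imported module works, the sketch below uses the standard pickle module directly (Pool is not involved here; this only shows the serialization step, on the assumption that this is what matters for the error). A function is stored as a reference made of a module name and a function name, not as code, so the receiving side has to be able to look that name up again.

```python
import pickle

# pickle stores a function as a reference (module name plus function name),
# not as code.
payload = pickle.dumps(abs)
has_module_name = b"builtins" in payload
has_function_name = b"abs" in payload

# Unpickling looks the name up again; a built-in can be found in any process.
restored = pickle.loads(payload)
print(has_module_name, has_function_name, restored is abs)
```

The same lookup succeeds for defs.f because any process can import defs.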
The multiprocessing module has a major limitation when it comes to IPython use:
Functionality within this package requires that the __main__ module be importable by the children. [...] This means that some examples, such as the multiprocessing.pool.Pool examples will not work in the interactive interpreter. [from the documentation]
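The error from the question can be reproduced without any child process, which may make the quoted limitation easier to see. This is only a sketch: scratch_mod is a hypothetical stand-in module created for the demonstration, and deleting f from it imitates a child whose __main__ never defined the function.

```python
import pickle
import sys
import types

# Hypothetical stand-in module, registered so pickle can find it by name.
scratch = types.ModuleType("scratch_mod")
sys.modules["scratch_mod"] = scratch
exec("def f(x):\n    return x * x\n", scratch.__dict__)

# The payload holds only the names "scratch_mod" and "f", not the code.
payload = pickle.dumps(scratch.f)

# Imitate a process in which the module exists but f was never defined.
del scratch.f

try:
    pickle.loads(payload)
    error = None
except AttributeError as exc:
    error = exc

print(type(error).__name__, error)
```

The unpickling step fails with an AttributeError because the name can no longer be resolved in the module.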
Fortunately, there is a fork of the multiprocessing module called multiprocess, which uses dill instead of pickle for serialization and conveniently overcomes this issue.
Just install multiprocess and replace multiprocessing with multiprocess in your imports:
import multiprocess as mp

def f(x):
    return x*x

with mp.Pool(5) as pool:
    print(pool.map(f, [1, 2, 3, 4, 5]))
Of course, externalizing the code as suggested in this answer works as well, but I find it very inconvenient: That is not why (and how) I use IPython environments.
<tl;dr> multiprocessing does not work in IPython environments right away, use its fork multiprocess instead.
This answer is for those who get this error on Windows 10 in 2021.
I've researched this error a bit since I got it myself. I get this error when running any examples from the official Python 3 documentation on multiprocessing.
Test environment:
- x86 Windows 10.0.19043.1165 + Python 3.9.2 - there is an error
- x86 Windows 10.0.19043.1165 + Python 3.9.6 - there is an error
- x86 Windows 10.0.19043.1110 + Python 3.9.6 - there is an error
- ARM Windows 10.0.21354.1 + Python 3.9.6 - no error (version from DEV branch)
- ARM macOS 11.5.2 + Python 3.9.6 - no errors
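To compare results across machines like the ones listed above, it may help to record the same details in a uniform way. The snippet below is a small sketch using only the standard library; the env dictionary and its keys are names chosen here for illustration. The start method is included because it differs between platforms, though whether it is related to this error is an assumption.

```python
import multiprocessing
import platform
import sys

# Collect the details that the environment list above reports by hand.
env = {
    "os": platform.platform(),
    "machine": platform.machine(),
    "python": sys.version.split()[0],
    "start_method": multiprocessing.get_start_method(),
}

for key, value in env.items():
    print(key, value)
```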
I have no way to test other configurations. My guess is that the problem lies in Windows, since the bug does not occur in the developer build "10.0.21354.1", although that ARM build probably uses x86 emulation.
Also note that this bug did not occur when Python 3.9.2 was released (February). I was working on the same computer the whole time, so I was surprised when previously working code stopped working and only the Windows version had changed.
I was unable to find a bug report describing a similar situation in the Python bug tracker (I probably did a poor search), and the message marked "Correct answer" refers to a different situation. The problem is easy to reproduce: try any example from the multiprocessing documentation on a freshly installed Windows 10 + Python 3.
Later, I will have the opportunity to check out Python 3.10 and the latest version of Windows 10. I am also interested in this situation in the context of Windows 11.
If you have information about this error (link to the bug tracker or something similar), be sure to share it.
For now, I have switched to Linux to continue working.