Parallel: Import a Python file from sibling folder
You are correct about what the issue is.
In your example, you modify sys.path in main.py in order to be able to import my_agent.my_worker and my_utility.my_utils.
However, this path change is not propagated to the worker processes, so if you were to run a remote function like

    import ray

    @ray.remote
    def f():
        # Print the module search path as seen by the worker process.
        import sys
        print(sys.path)

    ray.get(f.remote())  # wait for the task so its output is shown

you would see that sys.path on the worker does not include the parent directory that you added.
The reason that modifying sys.path on the worker (e.g., in the MyWorker constructor) doesn't work is that the MyWorker class definition is pickled and shipped to the workers. The worker then unpickles it, and unpickling the class definition requires my_utils to be imported. That import fails because the actor constructor hasn't had a chance to run yet.
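The unpickling failure described above can be reproduced in miniature with the standard library's pickle module alone. This is only an analogy, not Ray's actual serialization path, and the module name demo_utils_mod and the class Helper are hypothetical stand-ins for my_utils:

```python
import importlib
import pickle
import sys
import tempfile
from pathlib import Path


def unpickle_without_path():
    """Pickle an object from a temporary module, then unpickle it after
    removing that module's directory from sys.path."""
    with tempfile.TemporaryDirectory() as tmp:
        # Hypothetical stand-in for my_utility/my_utils.py.
        Path(tmp, "demo_utils_mod.py").write_text("class Helper:\n    pass\n")

        sys.path.insert(0, tmp)
        importlib.invalidate_caches()
        module = importlib.import_module("demo_utils_mod")

        # The pickle stores the module and class *names*, not the class code.
        payload = pickle.dumps(module.Helper())

        # Simulate a fresh worker: module not loaded, directory not on sys.path.
        sys.path.remove(tmp)
        del sys.modules["demo_utils_mod"]
        importlib.invalidate_caches()

        try:
            pickle.loads(payload)  # has to import demo_utils_mod first
        except ModuleNotFoundError as exc:
            return type(exc).__name__
        return "loaded"


if __name__ == "__main__":
    print(unpickle_without_path())
```

The import happens during loading, before any method of the unpickled object could run, which is why a sys.path fix placed in a constructor comes too late.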
There are a couple of possible solutions here.
- Run the script with something like

      PYTHONPATH=$(dirname $(pwd)):$PYTHONPATH python main.py

  (from within working_dir/). That should solve the issue because in this case the worker processes are forked from the scheduler process (which is forked from the main Python interpreter when you call ray.init()), and so the environment variable will be inherited by the workers. This doesn't happen for sys.path, presumably because it is not an environment variable.
- It looks like adding the lines

      parent_dir = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
      os.environ["PYTHONPATH"] = parent_dir + ":" + os.environ.get("PYTHONPATH", "")

  in main.py (before the ray.init() call) also works for the same reason as above.

Alternatively, consider adding a setup.py and installing your project as a Python package so that it's automatically on the relevant path.
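The environment-variable reasoning in the two options above can be checked without Ray. The sketch below starts a fresh Python interpreter with subprocess, which is an assumption standing in for however Ray launches its workers, and compares a sys.path change against a PYTHONPATH change. The helper names child_sees and compare_propagation are made up for illustration:

```python
import os
import subprocess
import sys
import tempfile

# Code run in the child: report whether argv[1] is on the child's sys.path.
CHILD_CODE = """
import os, sys
target = os.path.realpath(sys.argv[1])
print(target in [os.path.realpath(p) for p in sys.path])
"""


def child_sees(directory, env):
    """Start a fresh interpreter and check whether `directory` is on its sys.path."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD_CODE, directory],
        env=env, capture_output=True, text=True, check=True,
    )
    return result.stdout.strip() == "True"


def compare_propagation():
    with tempfile.TemporaryDirectory() as extra:
        # 1) sys.path is interpreter state: the change stays in this process.
        sys.path.insert(0, extra)
        try:
            via_sys_path = child_sees(extra, dict(os.environ))
        finally:
            sys.path.remove(extra)

        # 2) PYTHONPATH is an environment variable: child processes inherit it.
        new_pythonpath = extra + os.pathsep + os.environ.get("PYTHONPATH", "")
        via_env = child_sees(extra, dict(os.environ, PYTHONPATH=new_pythonpath))
    return via_sys_path, via_env


if __name__ == "__main__":
    print(compare_propagation())
```

Under these assumptions the function should return (False, True): only the PYTHONPATH change reaches the child. The sketch uses os.pathsep rather than a hard-coded ":" so that it also runs on Windows.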
The new "Runtime Environments" feature, which didn't exist at the time of this post, should help with this issue: https://docs.ray.io/en/latest/handling-dependencies.html#runtime-environments. (See the working_dir and py_modules entries.)
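As a rough sketch of how the py_modules entry might be used here, assuming the layout is <parent>/working_dir/main.py with my_agent and my_utility as sibling folders under <parent> (inferred from the snippets above, so adjust to your project). The helper names build_runtime_env and init_ray are hypothetical; check the linked documentation for the exact options your Ray version supports:

```python
import os


def build_runtime_env(main_file):
    """Build a runtime_env dict that ships the sibling packages to the workers."""
    # Assumed layout: <parent>/working_dir/main.py next to
    # <parent>/my_agent and <parent>/my_utility.
    parent_dir = os.path.dirname(os.path.dirname(os.path.abspath(main_file)))
    return {
        "py_modules": [
            os.path.join(parent_dir, "my_agent"),
            os.path.join(parent_dir, "my_utility"),
        ]
    }


def init_ray(main_file):
    """Start Ray with the runtime environment above (requires Ray to be installed)."""
    import ray  # imported lazily so build_runtime_env can be inspected without Ray

    return ray.init(runtime_env=build_runtime_env(main_file))
```

In main.py this would be called as init_ray(__file__) in place of a bare ray.init(). The driver process itself still needs the packages to be importable, for example through the sys.path change you already have.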