multiprocessing: How can I reliably redirect stdout from a child process?
NB. I have seen Log output of multiprocessing.Process - unfortunately, it doesn't answer this question.
I am creating a child process (on Windows) via multiprocessing. I want all of the child process's stdout and stderr output to be redirected to a log file, rather than appearing at the console. The only suggestion I have seen is for the child process to set sys.stdout to a file. However, this does not redirect all stdout output, because of the way stdout redirection behaves on Windows.
To illustrate the problem, build a Windows DLL with the following code:
#include <iostream>

extern "C"
{
    __declspec(dllexport) void writeToStdOut()
    {
        std::cout << "Writing to STDOUT from test DLL" << std::endl;
    }
}
Then create and run a python script like the following, which imports this DLL and calls the function:
from ctypes import *
import sys
print
print "Writing to STDOUT from python, before redirect"
print
sys.stdout = open("stdout_redirect_log.txt", "w")
print "Writing to STDOUT from python, after redirect"
testdll = CDLL("Release/stdout_test.dll")
testdll.writeToStdOut()
To see the same behaviour as me, the DLL probably needs to be built against a different C runtime than the one Python uses. In my case, Python is built with Visual Studio 2010, but my DLL is built with VS 2005.
The behaviour I see is that the console shows:
> stdout_test.py
Writing to STDOUT from python, before redirect
Writing to STDOUT from test DLL
While the file stdout_redirect_log.txt ends up containing:
Writing to STDOUT from python, after redirect
In other words, setting sys.stdout failed to redirect the stdout output generated by the DLL. This is unsurprising given the nature of the underlying APIs for stdout redirection in Windows. I have encountered this problem at the native/C++ level before and never found a way to reliably redirect stdout from within a process. It has to be done externally.
This is actually the very reason I am launching a child process - it's so that I can connect externally to its pipes and thus guarantee that I am intercepting all of its output. I can definitely do this by launching the process manually with pywin32, but I would very much like to be able to use the facilities of multiprocessing, in particular the ability to communicate with the child process via a multiprocessing Pipe object, in order to get progress updates. The question is whether there is any way to both use multiprocessing for its IPC facilities and to reliably redirect all of the child's stdout and stderr output to a file.
UPDATE: Looking at the source code for multiprocessing.Process, it has a static member, _Popen, which looks like it can be used to override the class used to create the process. If it's set to None (the default), it uses multiprocessing.forking._Popen, but it looks like by saying
multiprocessing.Process._Popen = MyPopenClass
I could override the process creation. However, although I could derive this from multiprocessing.forking._Popen, it looks like I would have to copy a bunch of internal stuff into my implementation, which sounds flaky and not very future-proof. If that's the only choice I think I'd probably plump for doing the whole thing manually with pywin32 instead.
Solution 1:
The solution you suggest is a good one: create your processes manually such that you have explicit access to their stdout/stderr file handles. You can then create a socket to communicate with the sub-process and use multiprocessing.connection over that socket (multiprocessing.Pipe creates the same type of connection object, so this should give you all the same IPC functionality).
Here's a two-file example.
master.py:
import errno
import multiprocessing.connection
import socket
import subprocess
import sys

## errno for "address already in use": 98 on Linux, WSAEADDRINUSE on Windows
ADDR_IN_USE = (errno.EADDRINUSE, getattr(errno, "WSAEADDRINUSE", None))

## Listen for connection from remote process (and find free port number)
port = 10000
while True:
    try:
        l = multiprocessing.connection.Listener(('localhost', int(port)), authkey=b"secret")
        break
    except socket.error as ex:
        if ex.errno not in ADDR_IN_USE:
            raise
        port += 1 ## port is not available, try the next one

proc = subprocess.Popen((sys.executable, "subproc.py", str(port)), stdout=subprocess.PIPE, stderr=subprocess.PIPE)

## open connection for remote process
conn = l.accept()
conn.send([1, "asd", None])
print(proc.stdout.readline())
subproc.py:
import multiprocessing.connection
import sys

port = int(sys.argv[1])
conn = multiprocessing.connection.Client(('localhost', port), authkey=b"secret")
while True:
    try:
        obj = conn.recv()
        print("received: %s\n" % str(obj))
        sys.stdout.flush()
    except EOFError: ## connection closed
        break
You may also want to see the first answer to this question to get non-blocking reads from the subprocess.
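For reference, one common pattern for non-blocking reads (not necessarily the one used in the linked answer) is to drain the pipe on a background thread and hand lines over through a queue. The sketch below assumes that approach; the helper names start_reader and poll_line are made up for illustration.

```python
import subprocess
import sys
import threading

try:
    import queue            # Python 3
except ImportError:
    import Queue as queue   # Python 2


def start_reader(pipe):
    """Drain `pipe` on a daemon thread and return a queue of its lines."""
    lines = queue.Queue()

    def pump():
        # readline() returns b"" at EOF, which ends the loop.
        for line in iter(pipe.readline, b""):
            lines.put(line)
        pipe.close()

    thread = threading.Thread(target=pump)
    thread.daemon = True  # do not keep the parent alive just for this reader
    thread.start()
    return lines


def poll_line(lines, timeout=None):
    """Return the next line, or None if nothing arrived in time.

    With timeout=None this returns immediately instead of blocking.
    """
    try:
        if timeout is None:
            return lines.get_nowait()
        return lines.get(timeout=timeout)
    except queue.Empty:
        return None


if __name__ == "__main__":
    proc = subprocess.Popen(
        [sys.executable, "-c", "print('hello from child')"],
        stdout=subprocess.PIPE)
    child_lines = start_reader(proc.stdout)
    print(poll_line(child_lines, timeout=5))
    proc.wait()
```

In master.py, the same helper could wrap proc.stdout and proc.stderr so that the parent keeps servicing conn while the child's output accumulates in the queues.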
Solution 2:
I don't think you have a better option than redirecting a subprocess to a file as you mentioned in your comment.
The way console stdin/stdout/stderr work on Windows is that each process has its standard handles defined when it is created. You can change them with SetStdHandle. When you modify Python's sys.stdout, you only change where Python prints, not where other DLLs print. Part of the CRT in your DLL uses GetStdHandle to find out where to print. If you want, you can do whatever piping you like with the Windows API in your DLL, or in your Python script with pywin32, though I do think it will be simpler with subprocess.
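A minimal sketch of the subprocess route, assuming Python 3: the parent opens the log file and passes it as the child's stdout and stderr, so the standard handles already point at the file when the child is created. Progress updates still travel over a multiprocessing.connection channel. The names CHILD_CODE, run_child and child_log.txt are hypothetical, the authkey is just a placeholder, and the child is inlined with -c only to keep the sketch in one file. The child writes straight to file descriptors 1 and 2 to stand in for code that bypasses sys.stdout.

```python
import multiprocessing.connection
import os
import subprocess
import sys
import tempfile

# Hypothetical child program, passed via -c so the sketch fits in one file.
CHILD_CODE = r"""
import multiprocessing.connection, os, sys
conn = multiprocessing.connection.Client(("localhost", int(sys.argv[1])),
                                         authkey=b"secret")
sys.stdout = open(os.devnull, "w")   # Python-level redirect only
print("hidden: written to the reassigned sys.stdout")
os.write(1, b"low-level write to stdout\n")   # bypasses sys.stdout
os.write(2, b"low-level write to stderr\n")
conn.send({"progress": 100})
conn.close()
"""


def run_child(log_path):
    """Run the child with stdout/stderr sent to log_path at the OS level.

    Returns the progress update received over the connection and the
    raw bytes that ended up in the log file.
    """
    # Port 0 asks the OS for any free port; listener.address reports it.
    listener = multiprocessing.connection.Listener(("localhost", 0),
                                                   authkey=b"secret")
    port = listener.address[1]
    with open(log_path, "wb") as log:
        proc = subprocess.Popen(
            [sys.executable, "-c", CHILD_CODE, str(port)],
            stdout=log, stderr=subprocess.STDOUT)
        conn = listener.accept()   # blocks until the child connects
        update = conn.recv()
        proc.wait()
    conn.close()
    listener.close()
    with open(log_path, "rb") as log:
        return update, log.read()


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "child_log.txt")
    update, logged = run_child(path)
    print(update)
    print(logged.decode())
```

Because the redirection happens before the child starts, anything in the child that asks the OS for its standard handles should find the log file, whichever C runtime it was built against. Note that listener.accept() waits indefinitely if the child fails to start, so real code would want a timeout or a check on proc.poll().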