Can you make a python subprocess output stdout and stderr as usual, but also capture the output as a string? [duplicate]

In this question, hanan-n asked whether it was possible to have a Python subprocess that sends its output to stdout as usual while also keeping that output in a string for later processing. The solution in that case was to loop over the output line by line, printing and storing each one manually:

import subprocess

output = []
p = subprocess.Popen(["the", "command"], stdout=subprocess.PIPE, text=True)
for line in iter(p.stdout.readline, ''):
    print(line, end='')  # each line already ends with a newline
    output.append(line)

However, this solution doesn't generalise to the case where you want to do this for both stdout and stderr, while satisfying the following:

  • the output from stdout/stderr should go to the parent process' stdout/stderr respectively
  • the output should be done in real time as much as possible (but I only need access to the strings at the end)
  • the order between stdout and stderr lines should not be changed (I'm not quite sure how that would even work if the subprocess flushes its stdout and stderr buffers at different intervals; let's assume for now that we get everything in nice chunks that contain full lines)

I looked through the subprocess documentation, but couldn't find anything that can achieve this. The closest I could find is to pass stderr=subprocess.STDOUT and use the same solution as above, but then we lose the distinction between regular output and errors. Any ideas? I'm guessing the solution - if there is one - will involve asynchronous reads of p.stdout and p.stderr.
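
For reference, that combined-stream fallback would look something like this (a sketch using the question's placeholder command; error lines become indistinguishable from regular output):

import subprocess

output = []
p = subprocess.Popen(["the", "command"], stdout=subprocess.PIPE,
                     stderr=subprocess.STDOUT, text=True)
for line in iter(p.stdout.readline, ''):
    print(line, end='')  # error lines are mixed in with regular output here
    output.append(line)
p.wait()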

Here is an example of what I would like to do:

p = subprocess.Popen(["the", "command"])
p.wait()  # while p runs, the command's stdout and stderr should behave as usual
p_stdout = p.stdout.read()  # unfortunately, p.stdout is None here unless you pass stdout=subprocess.PIPE
p_stderr = p.stderr.read()  # ditto for p.stderr
[do something with p_stdout and p_stderr]

Solution 1:

This example seems to work for me:

import subprocess
import sys
import select

p = subprocess.Popen(["find", "/proc"],
    stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True)

stdout = []
stderr = []

while True:
    # block until at least one of the two pipes has data to read
    reads = [p.stdout.fileno(), p.stderr.fileno()]
    ret = select.select(reads, [], [])

    for fd in ret[0]:
        if fd == p.stdout.fileno():
            read = p.stdout.readline()
            if read:
                sys.stdout.write('stdout: ' + read)
                stdout.append(read)
        if fd == p.stderr.fileno():
            read = p.stderr.readline()
            if read:
                sys.stderr.write('stderr: ' + read)
                stderr.append(read)

    if p.poll() is not None:
        # drain whatever is still buffered once the process has exited
        for read in p.stdout.readlines():
            sys.stdout.write('stdout: ' + read)
            stdout.append(read)
        for read in p.stderr.readlines():
            sys.stderr.write('stderr: ' + read)
            stderr.append(read)
        break

print('program ended')
print('stdout:', "".join(stdout))
print('stderr:', "".join(stderr))

In general, in any situation where you need to read from multiple file descriptors and don't know in advance which one will have data ready, you should use select or something equivalent (like a Twisted reactor).

Solution 2:

To print a subprocess' stdout/stderr to the console while also capturing them as strings, in a portable manner:

from io import StringIO

fout, ferr = StringIO(), StringIO()
exitcode = teed_call(["the", "command"], stdout=fout, stderr=ferr)
stdout = fout.getvalue()
stderr = ferr.getvalue()

where teed_call() is defined in the answer to "Python subprocess get children's output to file and terminal?"

You can pass any file-like objects that have a .write() method.
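
For reference, a teed_call() along those lines might look roughly like this (my sketch, not the linked answer verbatim; it assumes a text-mode child and line-oriented output):

import subprocess
import sys
import threading

def tee(pipe, *files):
    # copy each line from the pipe to every destination
    for line in iter(pipe.readline, ''):
        for f in files:
            f.write(line)
    pipe.close()

def teed_call(cmd_args, stdout=None, stderr=None, **kwargs):
    # capture a stream only if the caller provided a destination for it
    p = subprocess.Popen(cmd_args,
        stdout=subprocess.PIPE if stdout is not None else None,
        stderr=subprocess.PIPE if stderr is not None else None,
        text=True, **kwargs)
    threads = []
    if stdout is not None:
        threads.append(threading.Thread(target=tee, args=(p.stdout, stdout, sys.stdout)))
    if stderr is not None:
        threads.append(threading.Thread(target=tee, args=(p.stderr, stderr, sys.stderr)))
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return p.wait()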

Solution 3:

Create two readers as above, one for stdout and one for stderr, and start each in a new thread. The lines would be printed in roughly the same order they were output by the process. Maintain two separate lists if you want to keep the streams apart.

i.e.,

import subprocess, sys, threading

stdout, stderr = [], []
p = subprocess.Popen(["the", "command"],
    stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True)
t1 = threading.Thread(target=reader, args=(p.stdout, sys.stdout, stdout))
t2 = threading.Thread(target=reader, args=(p.stderr, sys.stderr, stderr))
t1.start(); t2.start()
p.wait(); t1.join(); t2.join()
# your logic here
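
where reader() could be a small helper along these lines (my sketch, not part of the original answer; it assumes text-mode pipes and echoes each line to the given console stream while keeping a copy):

def reader(pipe, console, buf):
    # echo each line as it arrives and keep a copy for later processing
    for line in iter(pipe.readline, ''):
        console.write(line)
        buf.append(line)
    pipe.close()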