Intercepting stdout of a subprocess while it is running

If this is my subprocess:

import time, sys
for i in range(200):
    sys.stdout.write( 'reading %i\n'%i )
    time.sleep(.02)

And this is the script controlling and modifying the output of the subprocess:

import subprocess, time, sys

print 'starting'
    
proc = subprocess.Popen(
    'c:/test_apps/testcr.py',
    shell=True,
    stdin=subprocess.PIPE,
    stdout=subprocess.PIPE  )

print 'process created'

while True:
    #next_line = proc.communicate()[0]
    next_line = proc.stdout.readline()
    if next_line == '' and proc.poll() is not None:
        break
    sys.stdout.write(next_line)
    sys.stdout.flush()
    
print 'done'

Why are readline and communicate waiting until the process is done running? Is there a simple way to pass along (and modify) the subprocess's stdout in real time?

I'm on Windows XP.


Solution 1:

As Charles already mentioned, the problem is buffering. I ran into a similar problem when writing some modules for SNMPd, and solved it by replacing stdout in the subprocess with an auto-flushing version.

I used the following code, inspired by some posts on ActiveState:

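import sys

class FlushFile(object):
    """Write-only flushing wrapper for file-type objects."""
    def __init__(self, f):
        self.f = f
    def write(self, x):
        # Forward the data, then push it out of the buffer right away
        self.f.write(x)
        self.f.flush()
    def flush(self):
        # Callers that flush explicitly still work
        self.f.flush()

# Replace stdout with an automatically flushing version
# (do this in the subprocess, before it writes anything)
sys.stdout = FlushFile(sys.__stdout__)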
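
To check that every write really is followed by a flush, here is a small sketch that wraps a stand-in stream instead of the real stdout. CountingStream is a hypothetical helper made up for this illustration, and the snippet uses Python 3 syntax:

```python
import io

class FlushFile(object):
    """Write-only flushing wrapper for file-type objects (same idea as above)."""
    def __init__(self, f):
        self.f = f
    def write(self, x):
        self.f.write(x)
        self.f.flush()

class CountingStream(io.StringIO):
    """Hypothetical stand-in for stdout that counts flush() calls."""
    def __init__(self):
        super().__init__()
        self.flush_count = 0
    def flush(self):
        self.flush_count += 1
        super().flush()

stream = CountingStream()
out = FlushFile(stream)
for i in range(3):
    out.write('reading %i\n' % i)

# One flush per write means the reader on the other end of a pipe
# would see each line as soon as it is written.
print(stream.flush_count)
```

The same wrapper works around any object with write() and flush(), so it can be tried out this way without touching sys.stdout.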

Solution 2:

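Process output is buffered: when stdout is a pipe rather than a terminal, the runtime typically switches from line buffering to block buffering, so nothing reaches the reader until the buffer fills or the process exits. On more UNIXy operating systems (or Cygwin), the pexpect module is available, which recites all the necessary incantations to avoid buffering-related issues. However, these incantations require a working pty module, which is not available on native (non-Cygwin) win32 Python builds.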

In the example case where you control the subprocess, you can just have it call sys.stdout.flush() where necessary -- but for arbitrary subprocesses, that option isn't available.
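
For the case where you do control the subprocess, a minimal sketch of both sides might look like the following. It is written for Python 3; relay and CHILD_SOURCE are names invented for this example, and the child is passed inline with -c (and shortened to five lines) only to keep the snippet self-contained:

```python
import subprocess
import sys

# Child source for illustration only: the loop from the question,
# shortened, with an explicit flush after every line.
CHILD_SOURCE = r"""
import sys, time
for i in range(5):
    sys.stdout.write('reading %i\n' % i)
    sys.stdout.flush()   # push the line into the pipe immediately
    time.sleep(.02)
"""

def relay(command, transform):
    """Run command and pass each stdout line through transform as it arrives."""
    proc = subprocess.Popen(command, stdout=subprocess.PIPE,
                            universal_newlines=True)
    seen = []
    # readline() returns '' only at end of file, i.e. once the child
    # has closed its stdout, so this loop ends when the child is done.
    for line in iter(proc.stdout.readline, ''):
        modified = transform(line)
        sys.stdout.write(modified)
        sys.stdout.flush()
        seen.append(modified)
    proc.stdout.close()
    proc.wait()
    return seen

lines = relay([sys.executable, '-c', CHILD_SOURCE], lambda s: '> ' + s)
```

If the child is a Python script you cannot edit, starting it with the interpreter's -u option (for example `python -u script.py`) should have a similar effect, since -u makes its stdout unbuffered. That does not help with non-Python programs, which is where the limitation above still applies.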

See also the question "Why not just use a pipe (popen())?" in the pexpect FAQ.