Subprocess timeout failure

I want to use a timeout on a subprocess

 from subprocess32 import check_output
 output = check_output("sleep 30", shell=True, timeout=1)

Unfortunately, whilst this raises a timeout error, it does so after 30 seconds. It seems that check_output cannot interrupt the shell command.

What can I do on on the Python side to stop this? I suspect that subprocess32 fails to kill the timed out process.


Solution 1:

check_output() with timeout is essentially:

with Popen(*popenargs, stdout=PIPE, **kwargs) as process:
    try:
        output, unused_err = process.communicate(inputdata, timeout=timeout)
    except TimeoutExpired:
        process.kill()
        output, unused_err = process.communicate()
        raise TimeoutExpired(process.args, timeout, output=output)

There are two issues:

  • [the second] .communicate() may wait for descendant processes, not just for the immediate child, see Python subprocess .check_call vs .check_output
  • process.kill() might not kill the whole process tree, see How to terminate a python subprocess launched with shell=True

It leads to the behaviour that you observed: the TimeoutExpired happens in a second, the shell is killed, but check_output() returns only in 30 seconds after the grandchild sleep process exits.

To workaround the issues, kill the whole process tree (all subprocesses that belong to the same group):

#!/usr/bin/env python3
import os
import signal
from subprocess import Popen, PIPE, TimeoutExpired
from time import monotonic as timer

start = timer()
with Popen('sleep 30', shell=True, stdout=PIPE, preexec_fn=os.setsid) as process:
    try:
        output = process.communicate(timeout=1)[0]
    except TimeoutExpired:
        os.killpg(process.pid, signal.SIGINT) # send signal to the process group
        output = process.communicate()[0]
print('Elapsed seconds: {:.2f}'.format(timer() - start))

Output

Elapsed seconds: 1.00

Solution 2:

Update for Python 3.6.

This is still happening but I have tested a lot of combinations of check_output, communicate and run methods and now I have a clear knowledge about where is the bug and how to avoid it in a easy way on Python 3.5 and Python 3.6.

My conclusion: It happens when you mix the use shell=True and any PIPE on stdout, stderr or stdin parameters (used in Popen and run methods).

Be careful: check_output uses PIPE inside. If you look at the code inside on Python 3.6 it is basically a call to run with stdout=PIPE: https://github.com/python/cpython/blob/ae011e00189d9083dd84c357718264e24fe77314/Lib/subprocess.py#L335

So, to solve @innisfree problem on Python 3.5 or 3.6 just do this:

check_output(['sleep', '30'], timeout=1)

And for other cases, just avoid mixing shell=True and PIPE, keeping in mind that check_output uses PIPE.