Understanding Python subprocess.check_output's first argument and shell=True

I'm confused about how to correctly use Python's subprocess module, specifically the first argument of check_output and the shell option. In the interactive session below, I pass the first argument as a list and get different output depending on whether shell=True is set. Can someone explain why this happens and where each output comes from?

>>> import subprocess
>>> subprocess.check_output(["echo", "Hello World!"])
'Hello World!\n'
>>> subprocess.check_output(["echo", "Hello World!"], shell=True)
'\n'

Now when I pass the first argument as a simple string instead of a list, I get this nasty stack trace. Why is that and what's going on here?

>>> subprocess.check_output("echo Hello World!")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/subprocess.py", line 537, in check_output
    process = Popen(stdout=PIPE, *popenargs, **kwargs)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/subprocess.py", line 679, in __init__
    errread, errwrite)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/subprocess.py", line 1228, in _execute_child
    raise child_exception
OSError: [Errno 2] No such file or directory

However, when I turn on shell=True, it then works perfectly:

>>> subprocess.check_output("echo Hello World!", shell=True)
'Hello World!\n'

So I'm a little confused: it works when the first argument is a list WITHOUT shell=True, and it works as a simple string WITH shell=True. I don't understand what shell=True does, or the difference between passing the first argument as a list and as a string.


From the documentation of Popen:

The shell argument (which defaults to False) specifies whether to use the shell as the program to execute. If shell is True, it is recommended to pass args as a string rather than as a sequence.

On Unix with shell=True, the shell defaults to /bin/sh. If args is a string, the string specifies the command to execute through the shell. This means that the string must be formatted exactly as it would be when typed at the shell prompt. This includes, for example, quoting or backslash escaping filenames with spaces in them. If args is a sequence, the first item specifies the command string, and any additional items will be treated as additional arguments to the shell itself. That is to say, Popen does the equivalent of:

Popen(['/bin/sh', '-c', args[0], args[1], ...])

On Windows with shell=True, the COMSPEC environment variable specifies the default shell. The only time you need to specify shell=True on Windows is when the command you wish to execute is built into the shell (e.g. dir or copy). You do not need shell=True to run a batch file or console-based executable.

In your case, since echo is a shell built-in when launched with shell=True, if you want to pass arguments to the echo command you must either write the whole command as a single string or pass a sequence whose first element is the whole command as a string. The other elements of the sequence are passed as arguments to the shell, not to the command you are invoking.
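As a rough illustration of the equivalence quoted from the documentation, the sketch below compares a list passed with shell=True against the spelled-out /bin/sh -c form, then shows the two spellings that do hand the argument to echo. It assumes a Unix system with /bin/sh available; the variable names are only for illustration. Note that on Python 3 check_output returns bytes, whereas the Python 2.7 session in the question shows str.

```python
import subprocess

# On Unix, a list with shell=True is run as: /bin/sh -c args[0] args[1] ...
via_shell_flag = subprocess.check_output(["echo", "Hello World!"], shell=True)
spelled_out = subprocess.check_output(["/bin/sh", "-c", "echo", "Hello World!"])

# Both run a bare `echo`, so both produce only a newline.
print(repr(via_shell_flag), repr(spelled_out))

# Either of these passes the argument to echo itself:
# the whole command as one string, or as the only item of the list.
as_string = subprocess.check_output("echo 'Hello World!'", shell=True)
as_one_item = subprocess.check_output(["echo 'Hello World!'"], shell=True)
print(repr(as_string), repr(as_one_item))
```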

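On most Unix systems echo is also a standalone program (usually /bin/echo). That is why your first example, the list without shell=True, works: Python executes that program directly and passes it 'Hello World!' as its argument. It also explains why your second example, the list with shell=True, didn't raise an exception but returned '\n' instead of the expected 'Hello World!\n': echo was executed without arguments, because 'Hello World!' was "consumed" by the shell (it becomes the shell's $0).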
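One way to see where the "consumed" argument went is to ask the shell for its positional parameters. This is a minimal sketch assuming a POSIX /bin/sh; the string "ignored" is just a placeholder chosen here to occupy $0.

```python
import subprocess

# "Hello World!" is handed to the shell, where it becomes $0.
# The command string can read it back from there.
consumed = subprocess.check_output(['echo "$0"', "Hello World!"], shell=True)
print(repr(consumed))

# Further list items become $1, $2, ... ("ignored" fills the $0 slot).
positional = subprocess.check_output(
    ['echo "$1 $2"', "ignored", "Hello", "World!"], shell=True
)
print(repr(positional))
```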

The error raised when calling:

subprocess.check_output("echo Hello World!")

is due to the fact that, since you did not use shell=True, Python tries to execute a program literally named echo Hello World!, i.e. a program whose name is echo<space>Hello<space>World!. That is a valid program name, but there is no program with that name on your system, hence the exception.
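If you start from a command string but do not want a shell involved, one option is to split the string into a list yourself with shlex.split from the standard library. The sketch below assumes a Unix system with an echo executable on the PATH; the names command, args, failed and output are only for illustration.

```python
import shlex
import subprocess

# Without shell=True, a plain string is taken as the program name,
# so this raises OSError (FileNotFoundError on Python 3).
try:
    subprocess.check_output("echo Hello World!")
    failed = False
except OSError:
    failed = True

# shlex.split applies shell-like quoting rules to build the argument list.
command = "echo 'Hello World!'"
args = shlex.split(command)  # ['echo', 'Hello World!']

# The list form runs the echo program directly, no shell needed.
output = subprocess.check_output(args)
print(failed, args, repr(output))
```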