What is the second sh in `sh -c 'some shell code' sh`?
Question
I encountered the following snippet:
sh -c 'some shell code' sh …
(where … denotes zero or more additional arguments). I know the first sh is the command. I know sh -c is supposed to execute the provided shell code (i.e. some shell code). What is the purpose of the second sh?
Disclaimer
A similar or related question sometimes appears as a follow-up after sh -c is properly used in an answer and the asker (or another user) wants to know in detail how the answer works. Or it may be part of a bigger question of the type "what does this code do?". The purpose of the current question is to provide a canonical answer below.
The main question, similar or related questions covered here are:
- What is the second sh in sh -c 'some shell code' sh …?
- What is the second bash in bash -c 'some shell code' bash …?
- What is find-sh in find . -exec sh -c 'some shell code' find-sh {} \;?
- If some shell code was in a shell script and we called ./myscript foo …, then foo would be referred to as $1 inside the script. But sh -c 'some shell code' foo … (or bash -c …) refers to foo as $0. Why the discrepancy?
- What is wrong with using sh -c 'some shell code' foo … where foo is a "random" argument? In particular:
  sh -c 'some shell code' "$variable"
  sh -c 'some shell code' "$@"
  find . -exec sh -c 'some shell code' {} \;
  find . -exec sh -c 'some shell code' {} +
  I mean I can use $0 instead of $1 inside some shell code, it doesn't bother me. What bad can happen?
Some of the above may be considered duplicates (possibly cross-site duplicates) of existing questions (e.g. this one). Still, I haven't found a question/answer that aims at explaining the issue to beginners who want to understand sh -c … and its allegedly useless extra argument observed in high-quality answers. This question fills the gap.
Solution 1:
Preliminary note
It's quite uncommon to see sh -c 'some shell code' invoked directly from a shell. In practice, if you're in a shell then you will probably choose to use the same shell (or its subshell) to execute some shell code. It's quite common to see sh -c invoked from within another tool like find -exec.
Nevertheless, most of this answer elaborates on sh -c represented as a standalone command (which it is) because the main issue depends solely on sh. Later on, a few examples and hints use find where it seems useful and/or educational.
Basic answer
What is the second sh in sh -c 'some shell code' sh …?
It's an arbitrary string. Its purpose is to provide a meaningful name to use in warning and error messages. Here it's sh but it might be foo, shell1 or special purpose shell (properly quoted to include spaces).
Bash and other POSIX-compliant shells work in the same way when it comes to -c. While I find the POSIX documentation too formal to cite here, an excerpt from man 1 bash is quite straightforward:
bash [options] [command_string | file]
-c
If the -c option is present, then commands are read from the first non-option argument command_string. If there are arguments after the command_string, the first argument is assigned to $0 and any remaining arguments are assigned to the positional parameters. The assignment to $0 sets the name of the shell, which is used in warning and error messages.
In our case some shell code is the command_string and this second sh is "the first argument after". It is assigned to $0 in the context of some shell code.
This way the error from sh -c 'nonexistent-command' "special purpose shell" is (the exact wording varies between shells):
special purpose shell: nonexistent-command: command not found
and you immediately know which shell it comes from. This is useful if you have many sh -c invocations. "The first argument after the command_string" may not be supplied at all; in such a case the string sh will be assigned to $0 if the shell is sh, or bash if the shell is bash. Therefore these are equivalent:
sh -c 'some shell code' sh
sh -c 'some shell code'
But if you need to pass at least one argument after some shell code (i.e. maybe arguments that should be assigned to $1, $2, …) then there is no way to omit the one that will be assigned to $0.
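To see this assignment in action, here is a minimal sketch. The names my-sh, first and second are arbitrary placeholders chosen for illustration; nothing about them is special to the shell. The last command prints a custom diagnostic that starts with $0, which keeps the output independent of how a particular shell words its own errors.

```shell
# The argument right after the code becomes $0; the rest become $1, $2, …
name=$(sh -c 'printf "%s\n" "$0"' my-sh first second)
arg1=$(sh -c 'printf "%s\n" "$1"' my-sh first second)
count=$(sh -c 'printf "%s\n" "$#"' my-sh first second)
printf '$0=%s $1=%s $#=%s\n' "$name" "$arg1" "$count"

# A custom diagnostic that starts with $0 identifies the inner shell.
# It is written to stderr, so 2>&1 is needed to capture it here.
msg=$(sh -c 'printf "%s: cannot process %s\n" "$0" "$1" >&2' my-sh first 2>&1)
printf '%s\n' "$msg"
```

Note that $# reports 2, not 3: the name in the $0 slot is not counted among the positional parameters.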
Discrepancy?
If some shell code was in a shell script and we called ./myscript foo …, then foo would be referred to as $1 inside the script. But sh -c 'some shell code' foo … (or bash -c …) refers to foo as $0. Why the discrepancy?
A shell interpreting a script assigns the name of the script (e.g. ./myscript) to $0. Then the name will be used in warning and error messages. Usually this behavior is perfectly OK and there is no need to provide $0 manually. On the other hand, with sh -c there is no script to get a name from. Still, some meaningful name is useful, hence the way to provide it.
The discrepancy will vanish if you stop considering the first argument after some shell code as a (kind of) positional parameter for the code. If some shell code is in a script named myscript and you call ./myscript foo …, then the equivalent code with sh -c will be:
sh -c 'some shell code' ./myscript foo …
Here ./myscript is just a string; it looks like a path, but this path may not exist, and the string could just as well be something else. This way the same shell code can be used. The shell will assign foo to $1 in both cases. No discrepancy.
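The equivalence can be checked with a throwaway script. The following is only a sketch: the script body is a made-up one-liner, mktemp -d is assumed to be available (it is common but not required by POSIX), and the script is run as sh ./myscript rather than ./myscript purely to avoid needing an executable bit; $0 is the script name in that case too.

```shell
# Hypothetical throwaway script, created in a temporary directory.
dir=$(mktemp -d)
cat > "$dir/myscript" <<'EOF'
printf '%s got %s\n' "$0" "$1"
EOF

# The same code, once run as a script and once via sh -c.
from_script=$(cd "$dir" && sh ./myscript foo)
from_c=$(sh -c 'printf "%s got %s\n" "$0" "$1"' ./myscript foo)

printf '%s\n%s\n' "$from_script" "$from_c"
rm -r "$dir"
```

Both lines of output are identical, and in both cases foo is available as $1.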
Pitfalls of treating $0 like $1
What is wrong with using sh -c 'some shell code' foo … where foo is a "random" argument? […] I mean I can use $0 instead of $1 inside some shell code, it doesn't bother me. What bad can happen?
In many cases this will work. There are arguments against this approach though.
- The most obvious pitfall is that you may get misleading warnings or errors from the invoked shell. Remember they will start with whatever $0 expands to in the context of the shell. Consider this snippet:
sh -c 'eecho "$0"' foo # typo intended
The error is:
foo: eecho: command not found
and you may wonder if foo was treated as a command. It is not that bad if foo is hardcoded and unique; at least you know the error has something to do with foo, so it brings your attention to this very line of code. It can be worse:
# as regular user
sh -c 'ls "$0" > "$1"/' "$HOME" "/root/foo"
The output:
/home/kamil: /root/foo: Permission denied
The first reaction is: what happened to my home directory? Another example:
find /etc/fs* -exec sh -c '<<EOF' {} \; # insane shell code intended
Possible output:
/etc/fstab: warning: here-document at line 0 delimited by end-of-file (wanted `EOF')
It's very easy to think there's something wrong with /etc/fstab; or to wonder why the code wants to interpret it as a here-document.
Now run these commands and see how accurate the errors are when we provide meaningful names:
sh -c 'eecho "$1"' "shell with echo" foo # typo intended
sh -c 'ls "$1" > "$2"/' my-special-shell "$HOME" "/root/foo"
find /etc/fs* -exec sh -c '<<EOF' find-sh {} \; # insane shell code intended
- some shell code is not identical to what it would be in a script. This is directly related to the alleged discrepancy elaborated above. It may not be a problem at all; still, at some level of shell-fu you may appreciate consistency.
- Similarly, at some level you may find yourself enjoying scripting in the Right Way. Then even if you can get away with using $0, you won't do this because this is not how things are supposed to work.
- If you want to pass more than one argument or if the number of arguments is not known in advance and you need to process them in sequence, then using $0 for one of them is a bad idea. $0 is by design different than $1 or $2. This fact will manifest itself if some shell code uses one or more of the following:
  - $# – The number of positional parameters does not take $0 into account because $0 is not a positional parameter.
  - $@ or $* – "$@" is like "$1", "$2", …, there is no "$0" in this sequence.
  - for f do (equivalent to for f in "$@"; do) – $0 is never assigned to $f.
  - shift (shift [n] in general) – Positional parameters are shifted, $0 stays intact.
In particular consider this scenario:
  - You start with code like this:
    find . -exec sh -c 'some shell code referring "$1"' find-sh {} \;
  - You notice it runs one sh per file. This is suboptimal.
  - You know -exec … \; replaces {} with one filename but -exec … {} + replaces {} with possibly many filenames. You take advantage of the latter and introduce a loop:
    find . -exec sh -c '
      for f do
        some shell code referring "$f"
      done
    ' find-sh {} +
Such optimization is a good thing. But if you start with this:
# not exactly right but you will get away with this
find . -exec sh -c 'some shell code referring "$0"' {} \;
and turn it into this:
# flawed
find . -exec sh -c '
  for f do
    some shell code referring "$f"
  done
' {} +
then you will introduce a bug: the very first file coming from the expansion of {} won't be processed by some shell code referring "$f". Note that -exec sh -c … {} + runs sh with as many arguments as it can, but there are limits for this and if there are many, many files then one sh will not be enough; another sh process will be spawned by find (and possibly another, and another, …). With each sh you will skip (i.e. not process) one file.
To test this in practice, replace the string some shell code referring with echo and run the resulting code snippets in a directory with a few files. The last snippet won't print . (the starting directory, which find lists first).
All this doesn't mean you shouldn't use $0 in some shell code at all. You can and shall use $0 for things it was designed for. E.g. if you want some shell code to print a (custom) warning or error then make the message start with $0. Provide a meaningful name after some shell code and enjoy meaningful errors (if any) instead of vague or misleading ones.
- Final hints:
  - With find … -exec sh -c … never embed {} in the shell code: the filename would then be parsed as code instead of being passed as data.
  - For the same reasons some shell code should not contain fragments expanded by the current shell, unless you really really know the expanded values are safe for sure. The best practice is to single-quote the entire code (like in the examples above, it's always 'some shell code') and pass every non-fixed value as a separate argument. Such an argument is safely retrievable from a positional parameter within the inner shell. Exporting variables is also safe. Run this and analyze the output of each sh -c … (desired output is foo';date'):
    variable="foo';date'"
    # wrong
    sh -c "echo '$variable'" my-sh
    # right
    sh -c 'echo "$1"' my-sh "$variable"
    # also right
    export variable
    sh -c 'echo "$variable"' my-sh
  - If you run sh -c 'some shell code' … in a shell, the shell will remove single-quotes embracing some shell code; then the inner shell (sh) will parse some shell code. It's important to quote right also in this context. You may find this useful: Parameter expansion and quotes within quotes.
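The difference between $0 and the positional parameters described in the pitfalls above can be checked without find. In this sketch the three filenames are hypothetical stand-ins for what find would substitute for {} +, and find-sh is the same arbitrary name used earlier.

```shell
# With a name in the $0 slot, the loop sees all three arguments.
with_name=$(sh -c 'for f do printf "<%s>" "$f"; done' find-sh a.txt b.txt c.txt)

# Without it, a.txt lands in $0 and the loop silently skips it.
without_name=$(sh -c 'for f do printf "<%s>" "$f"; done' a.txt b.txt c.txt)

# shift and $# ignore $0 as well: only $1, $2, … move and get counted.
after_shift=$(sh -c 'shift; printf "%s %s %s" "$0" "$1" "$#"' find-sh a.txt b.txt c.txt)

printf '%s\n%s\n%s\n' "$with_name" "$without_name" "$after_shift"
```

The second line of output lacks a.txt, which is exactly the one-file-per-sh loss described for the flawed find … {} + variant.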