Reading quoted/escaped arguments correctly from a string
I'm encountering an issue passing an argument to a command in a Bash script.
poc.sh:
#!/bin/bash
ARGS='"hi there" test'
./swap ${ARGS}
swap:
#!/bin/sh
echo "${2}" "${1}"
The current output is:
there" "hi
Changing only poc.sh (as I believe swap does what I want it to correctly), how do I get poc.sh to pass "hi there" and test as two arguments, with "hi there" having no quotes around it?
Solution 1:
A Few Introductory Words
If at all possible, don't use shell-quoted strings as an input format.
- It's hard to parse consistently: different shells have different extensions, and different non-shell implementations implement different subsets (see the deltas between shlex and xargs below).
- It's hard to generate programmatically: ksh and bash have printf '%q', which will generate a shell-quoted string with the contents of an arbitrary variable, but the POSIX sh standard has no equivalent.
- It's easy to parse badly: many folks consuming this format use eval, which has substantial security concerns.
NUL-delimited streams are a far better practice, as they can accurately represent any possible shell array or argument list with no ambiguity whatsoever.
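To make the NUL-delimited idea concrete, here is a minimal bash sketch of a round trip. The sample values and the variable names (orig, restored, entry) are made up for illustration; a real producer would be whatever program emits your argument list.

```shell
#!/bin/bash
# Hypothetical sample data: arguments containing a space, a quote, and a newline.
orig=( "hi there" 'it"s' $'two\nlines' test )

# The producer writes each argument followed by a NUL byte; the consumer
# reads one NUL-terminated entry at a time, so no quoting rules are involved.
restored=( )
while IFS= read -r -d '' entry; do
  restored+=( "$entry" )
done < <(printf '%s\0' "${orig[@]}")

printf 'restored %d arguments\n' "${#restored[@]}"
```

Because NUL is the one byte that cannot appear inside an argument, nothing needs escaping on either side.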
xargs, with bashisms
If you're getting your argument list from a human-generated input source using shell quoting, you might consider using xargs to parse it. Consider:
array=( )
# xargs parses the quoting and emits each argument followed by a NUL byte
while IFS= read -r -d ''; do
  array+=( "$REPLY" )
done < <(xargs printf '%s\0' <<<"$ARGS")
./swap "${array[@]}"
...will put the parsed content of $ARGS into the array array. If you wanted to read from a file instead, substitute <filename for <<<"$ARGS".
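If you can assume bash 4.4 or newer (an assumption on my part, not something the question states), the read loop can be collapsed into a single mapfile call. This is a sketch of that variant, printing each parsed element in angle brackets so the word boundaries are visible:

```shell
#!/bin/bash
ARGS='"hi there" test'

# mapfile -d '' splits its input on NUL bytes (the -d option needs bash 4.4+);
# -t drops the trailing delimiter from each element.
mapfile -d '' -t array < <(xargs printf '%s\0' <<<"$ARGS")

# Show one element per line.
printf '<%s>\n' "${array[@]}"
# prints:
# <hi there>
# <test>
```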
xargs, POSIX-compliant
If you're trying to write code compliant with POSIX sh, this gets trickier. (I'm going to assume file input here for reduced complexity):
# This does not work with entries containing literal newlines; you need bash for that.
run_with_args() {
  while IFS= read -r entry; do
    set -- "$@" "$entry"
  done
  "$@"
}
xargs printf '%s\n' <argfile | run_with_args ./swap
These approaches are safer than running xargs ./swap <argfile inasmuch as they will throw an error if there are more or longer arguments than can be accommodated, rather than silently splitting the excess arguments across separate invocations of ./swap.
Python shlex -- rather than xargs -- with bashisms
If you need more accurate POSIX sh parsing than xargs implements, consider using the Python shlex module instead:
shlex_split() {
  # The Python source must start at column 0 inside the quoted string
  python -c '
import shlex, sys
for item in shlex.split(sys.stdin.read()):
    sys.stdout.write(item + "\0")
'
}
array=( )
while IFS= read -r -d ''; do
  array+=( "$REPLY" )
done < <(shlex_split <<<"$ARGS")
Solution 2:
Embedded quotes do not protect whitespace; they are treated literally. Use an array in bash:
args=( "hi there" test)
./swap "${args[@]}"
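To see the difference, you can count the words each form produces. This sketch uses the positional parameters (set --) as a stand-in for an array so that it also runs under plain sh; the variable names unquoted_count and list_count are only for illustration.

```shell
#!/bin/sh
ARGS='"hi there" test'

# Unquoted expansion is split on whitespace only. The quote characters
# inside the value are ordinary data, not shell syntax.
set -- $ARGS
unquoted_count=$#
echo "unquoted expansion: $unquoted_count words"   # 3 words: "hi  there"  test

# Quotes typed directly in the script ARE syntax, so the list has two words.
set -- "hi there" test
list_count=$#
echo "literal list: $list_count words"             # 2 words: hi there  test
```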
In POSIX shell, you are stuck using eval (which is why most shells support arrays).
args='"hi there" test'
eval "./swap $args"
As usual, be very sure you know the contents of $args and understand how the resulting string will be parsed before using eval.
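As a small illustration of why that caution matters, here is a sketch with a harmless injected command. The swap function is a hypothetical stand-in for ./swap so the example is self-contained, and the hostile input is made up.

```shell
#!/bin/sh
# Hypothetical stand-in for ./swap so the sketch runs on its own.
swap() { echo "$2" "$1"; }

# Made-up hostile input: everything after the semicolon is parsed by eval
# as a second, separate command.
args='"hi there" test; echo INJECTED'

output=$(eval "swap $args")
printf '%s\n' "$output"
# prints:
# test hi there
# INJECTED
```

Replace echo INJECTED with anything destructive and the problem is obvious: eval runs whatever the string says.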