Using getopts to process long and short command line options

Solution 1:

getopt and getopts are different beasts, and people seem to have a bit of misunderstanding of what they do. getopts is a built-in command to bash to process command-line options in a loop and assign each found option and value in turn to built-in variables, so you can further process them. getopt, however, is an external utility program, and it doesn't actually process your options for you the way that e.g. bash getopts, the Perl Getopt module or the Python optparse/argparse modules do. All that getopt does is canonicalize the options that are passed in — i.e. convert them to a more standard form, so that it's easier for a shell script to process them. For example, an application of getopt might convert the following:

myscript -ab infile.txt -ooutfile.txt

into this:

myscript -a -b -o outfile.txt infile.txt

You have to do the actual processing yourself. You don't have to use getopt at all if you make various restrictions on the way you can specify options:

  • only put one option per argument;
  • all options go before any positional parameters (i.e. non-option arguments);
  • for options with values (e.g. -o above), the value has to go as a separate argument (after a space).

Why use getopt instead of getopts? The basic reason is that only GNU getopt gives you support for long-named command-line options.1 (GNU getopt is the default on Linux. Mac OS X and FreeBSD come with a basic and not-very-useful getopt, but the GNU version can be installed; see below.)

For example, here's an example of using GNU getopt, from a script of mine called javawrap:

# NOTE: This requires GNU getopt.  On Mac OS X and FreeBSD, you have to install this
# separately; see below.
TEMP=$(getopt -o vdm: --long verbose,debug,memory:,debugfile:,minheap:,maxheap: \
              -n 'javawrap' -- "$@")

if [ $? != 0 ] ; then echo "Terminating..." >&2 ; exit 1 ; fi

# Note the quotes around '$TEMP': they are essential!
eval set -- "$TEMP"

VERBOSE=false
DEBUG=false
MEMORY=
DEBUGFILE=
JAVA_MISC_OPT=
while true; do
  case "$1" in
    -v | --verbose ) VERBOSE=true; shift ;;
    -d | --debug ) DEBUG=true; shift ;;
    -m | --memory ) MEMORY="$2"; shift 2 ;;
    --debugfile ) DEBUGFILE="$2"; shift 2 ;;
    --minheap )
      JAVA_MISC_OPT="$JAVA_MISC_OPT -XX:MinHeapFreeRatio=$2"; shift 2 ;;
    --maxheap )
      JAVA_MISC_OPT="$JAVA_MISC_OPT -XX:MaxHeapFreeRatio=$2"; shift 2 ;;
    -- ) shift; break ;;
    * ) break ;;
  esac
done

This lets you specify options like --verbose -dm4096 --minh=20 --maxhe 40 --debugfi="/Users/John Johnson/debug.txt" or similar. The effect of the call to getopt is to canonicalize the options to --verbose -d -m 4096 --minheap 20 --maxheap 40 --debugfile "/Users/John Johnson/debug.txt" so that you can more easily process them. The quoting around "$1" and "$2" is important as it ensures that arguments with spaces in them get handled properly.

If you delete the first 9 lines (everything up through the eval set line), the code will still work! However, your code will be much pickier in what sorts of options it accepts: In particular, you'll have to specify all options in the "canonical" form described above. With the use of getopt, however, you can group single-letter options, use shorter non-ambiguous forms of long-options, use either the --file foo.txt or --file=foo.txt style, use either the -m 4096 or -m4096 style, mix options and non-options in any order, etc. getopt also outputs an error message if unrecognized or ambiguous options are found.

NOTE: There are actually two totally different versions of getopt, basic getopt and GNU getopt, with different features and different calling conventions.2 Basic getopt is quite broken: Not only does it not handle long options, it also can't even handle embedded spaces inside of arguments or empty arguments, whereas getopts does do this right. The above code will not work in basic getopt. GNU getopt is installed by default on Linux, but on Mac OS X and FreeBSD it needs to be installed separately. On Mac OS X, install MacPorts (http://www.macports.org) and then do sudo port install getopt to install GNU getopt (usually into /opt/local/bin), and make sure that /opt/local/bin is in your shell path ahead of /usr/bin. On FreeBSD, install misc/getopt.

A quick guide to modifying the example code for your own program: Of the first few lines, all is "boilerplate" that should stay the same, except the line that calls getopt. You should change the program name after -n, specify short options after -o, and long options after --long. Put a colon after options that take a value.

Finally, if you see code that has just set instead of eval set, it was written for BSD getopt. You should change it to use the eval set style, which works fine with both versions of getopt, while the plain set doesn't work right with GNU getopt.

1Actually, getopts in ksh93 supports long-named options, but this shell isn't used as often as bash. In zsh, use zparseopts to get this functionality.

2Technically, "GNU getopt" is a misnomer; this version was actually written for Linux rather than the GNU project. However, it follows all the GNU conventions, and the term "GNU getopt" is commonly used (e.g. on FreeBSD).

Solution 2:

There are three implementations that may be considered:

  • Bash builtin getopts. This does not support long option names with the double-dash prefix. It only supports single-character options.

  • BSD UNIX implementation of standalone getopt command (which is what MacOS uses). This does not support long options either.

  • GNU implementation of standalone getopt. GNU getopt(3) (used by the command-line getopt(1) on Linux) supports parsing long options.


Some other answers show a solution for using the bash builtin getopts to mimic long options. That solution actually makes a short option whose character is "-". So you get "--" as the flag. Then anything following that becomes OPTARG, and you test the OPTARG with a nested case.

This is clever, but it comes with caveats:

  • getopts can't enforce the opt spec. It can't return errors if the user supplies an invalid option. You have to do your own error-checking as you parse OPTARG.
  • OPTARG is used for the long option name, which complicates usage when your long option itself has an argument. You end up having to code that yourself as an additional case.

So while it is possible to write more code to work around the lack of support for long options, this is a lot more work and partially defeats the purpose of using a getopt parser to simplify your code.

Solution 3:

The Bash builtin getopts function can be used to parse long options by putting a dash character followed by a colon into the optspec:

#!/usr/bin/env bash 
optspec=":hv-:"
while getopts "$optspec" optchar; do
    case "${optchar}" in
        -)
            case "${OPTARG}" in
                loglevel)
                    val="${!OPTIND}"; OPTIND=$(( $OPTIND + 1 ))
                    echo "Parsing option: '--${OPTARG}', value: '${val}'" >&2;
                    ;;
                loglevel=*)
                    val=${OPTARG#*=}
                    opt=${OPTARG%=$val}
                    echo "Parsing option: '--${opt}', value: '${val}'" >&2
                    ;;
                *)
                    if [ "$OPTERR" = 1 ] && [ "${optspec:0:1}" != ":" ]; then
                        echo "Unknown option --${OPTARG}" >&2
                    fi
                    ;;
            esac;;
        h)
            echo "usage: $0 [-v] [--loglevel[=]<value>]" >&2
            exit 2
            ;;
        v)
            echo "Parsing option: '-${optchar}'" >&2
            ;;
        *)
            if [ "$OPTERR" != 1 ] || [ "${optspec:0:1}" = ":" ]; then
                echo "Non-option argument: '-${OPTARG}'" >&2
            fi
            ;;
    esac
done

After copying to executable file name=getopts_test.sh in the current working directory, one can produce output like

$ ./getopts_test.sh
$ ./getopts_test.sh -f
Non-option argument: '-f'
$ ./getopts_test.sh -h
usage: code/getopts_test.sh [-v] [--loglevel[=]<value>]
$ ./getopts_test.sh --help
$ ./getopts_test.sh -v
Parsing option: '-v'
$ ./getopts_test.sh --very-bad
$ ./getopts_test.sh --loglevel
Parsing option: '--loglevel', value: ''
$ ./getopts_test.sh --loglevel 11
Parsing option: '--loglevel', value: '11'
$ ./getopts_test.sh --loglevel=11
Parsing option: '--loglevel', value: '11'

Obviously getopts neither performs OPTERR checking nor option-argument parsing for the long options. The script fragment above shows how this may be done manually. The basic principle also works in the Debian Almquist shell ("dash"). Note the special case:

getopts -- "-:"  ## without the option terminator "-- " bash complains about "-:"
getopts "-:"     ## this works in the Debian Almquist shell ("dash")

Note that, as GreyCat from over at http://mywiki.wooledge.org/BashFAQ points out, this trick exploits a non-standard behaviour of the shell which permits the option-argument (i.e. the filename in "-f filename") to be concatenated to the option (as in "-ffilename"). The POSIX standard says there must be a space between them, which in the case of "-- longoption" would terminate the option-parsing and turn all longoptions into non-option arguments.

Solution 4:

The built-in getopts command is still, AFAIK, limited to single-character options only.

There is (or used to be) an external program getopt that would reorganize a set of options such that it was easier to parse. You could adapt that design to handle long options too. Example usage:

aflag=no
bflag=no
flist=""
set -- $(getopt abf: "$@")
while [ $# -gt 0 ]
do
    case "$1" in
    (-a) aflag=yes;;
    (-b) bflag=yes;;
    (-f) flist="$flist $2"; shift;;
    (--) shift; break;;
    (-*) echo "$0: error - unrecognized option $1" 1>&2; exit 1;;
    (*)  break;;
    esac
    shift
done

# Process remaining non-option arguments
...

You could use a similar scheme with a getoptlong command.

Note that the fundamental weakness with the external getopt program is the difficulty of handling arguments with spaces in them, and in preserving those spaces accurately. This is why the built-in getopts is superior, albeit limited by the fact it only handles single-letter options.