How to parse and distinguish different and varying arguments of a user command with a regular expression?

I'm trying to interpret user commands that take optional, dash-prefixed flags.

{run -o -f -a file1 file2}

Something like this:

/{run (-o|-f) (\w+) (.+?)}/g;

This is very limiting, since it allows only one flag.

I'm looking for a regex that can properly parse the string with any number of dash flags, split the flags into groups, and tolerate any amount of whitespace in between.

string = "{run    -a  file1 file2}"
string = "{run    -a -o -f  file1 file2}"
string = "{run -f-a-o  file1 file2}"

string.match(regex) should output each flag and each file name.

Example output would be:

["f", "a", "o", "file1", "file2"]

Or, if that is not possible, something like this:

["-f-a-o", "file1", "file2"]

Solution 1:

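The number of capture groups a single regex match returns is fixed by the pattern itself. A group placed inside a quantifier does not multiply; it only keeps the text of its last iteration.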

Thus "pars[ing the] string with any amount of dash flags", as the OP demands, cannot be achieved by a single regex match alone.
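
The limitation can be seen in a minimal sketch. The two patterns below are illustrative only and are not the ones used in the solution: the first repeats a capture group and ends up with just the final flag, the second relies on the global flag and `String.prototype.matchAll` to get one match per flag.

```javascript
// A capture group inside a quantifier is overwritten on every iteration,
// so only the last flag survives in the match result.
const repeatedGroup = /^(?:\s*-([a-z]))+$/;
const lastOnly = repeatedGroup.exec('-a -o -f');
console.log(lastOnly[1]); // "f"

// With the global flag, matchAll yields one match object per flag instead.
const everyFlag = [...'-a -o -f'.matchAll(/-([a-z])/g)].map(match => match[1]);
console.log(everyFlag); // ["a", "o", "f"]
```

This is why the approach that follows captures the whole flag sequence as one group and splits it afterwards.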

A good enough approach is to capture two groups, the flags sequence and the files sequence, and then to process them into one concatenated list of separate flag and file name items ...

// see ... [https://regex101.com/r/VFjeK1/1]
const regXFlagsAndFiles =
  (/^\{\s*run(?:\s+-(?<flags>[a-z]+(?:\s*-[a-z]+)*))*\s+(?<files>[\w.]+(?:\s+[\w.]+)*)\s*\}$/);
  
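function parseFlagAndFileList(value) {
  // `groups` is undefined when the command does not match at all,
  // and `flags` is undefined when the command carries no flags.
  const {
    flags,
    files,
  } = regXFlagsAndFiles
    .exec(String(value))
    ?.groups || {};

  // the leading dash of the first flag is consumed outside the `flags` group,
  // so splitting at the remaining dashes yields the bare flag names.
  return (flags
    ?.split(/\s*-\s*/)
    ?? []
  ).concat(
    files
      ?.split(/\s+/)
      ?? []
  );
}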

console.log([

  '{run file1 file2}',
  '{run    -a  file1 file2}',
  '{run    -a -o -f  file1 file2}',
  '{run -f-a-o  file1 file2}',

].map(parseFlagAndFileList));

console.log([

  '{run}',
  '{run      }',
  '{run file1  }',
  '{run    -abc  file1.foo file2.bar  }',
  '{run    -ab -ogg -fgg  file1.baz file2  }',
  '{run -f-a-ob  file1 file2.biz  }',
  '{ run -a -b       }',
  '{ fun -a -b       }',

].map(parseFlagAndFileList));
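
The post-processing step can also be looked at in isolation. The sketch below assumes the two named groups have already been captured; the variable names `flagsText`, `filesText` and `items` are made up for this illustration.

```javascript
// Values as the named groups would capture them for
// '{run    -a -o -f  file1 file2}' (the first dash is consumed outside the group).
const flagsText = 'a -o -f';
const filesText = 'file1 file2';

// Split the flags at the remaining dashes and the files at whitespace.
const items = flagsText.split(/\s*-\s*/).concat(filesText.split(/\s+/));
console.log(items); // ["a", "o", "f", "file1", "file2"]

// Flags written without separating whitespace split the same way.
console.log('f-a-o'.split(/\s*-\s*/)); // ["f", "a", "o"]
```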

A regex which almost covers the OP's wish of doing it all with just one simple pattern would look like this one ... /[\w.]+/g.

It of course ...

  • covers no validation at all,
  • and needs to be supported by ...
    • String.prototype.match and
    • Array.prototype.slice.

// see ... [https://regex101.com/r/VFjeK1/3]
const regXCommandTokens = (/[\w.]+/g);

console.log([

  '{run file1 file2}',
  '{run    -a  file1 file2}',
  '{run    -a -o -f  file1 file2}',
  '{run -f-a-o  file1 file2}',

// `match` returns null when nothing matches, hence the fallback to an empty array.
].map(command => (command.match(regXCommandTokens) ?? []).slice(1)));

console.log([

  '{run}',
  '{run      }',
  '{run file1  }',
  '{run    -abc  file1.foo file2.bar  }',
  '{run    -ab -ogg -fgg  file1.baz file2  }',
  '{run -f-a-ob  file1 file2.biz  }',
  '{ run -a -b       }',
  '{ fun -a -b       }',

].map(command => (command.match(regXCommandTokens) ?? []).slice(1)));
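
The plain token pattern returns flags and file names in one flat list, so they can no longer be told apart. If that distinction matters, one possible variation is sketched below using `String.prototype.matchAll` with named groups. The names `regXFlagOrFile` and `tokenizeCommand` are hypothetical, and the sketch assumes lowercase flags and a command of the form `{name ...}`. Like the token pattern, it does no validation, and a file name containing a dash would be split incorrectly.

```javascript
// Either a dash followed by a flag name, or a bare file-name token.
const regXFlagOrFile = /-(?<flag>[a-z]+)|(?<file>[\w.]+)/g;

function tokenizeCommand(command) {
  const flags = [];
  const files = [];

  // Drop the opening brace plus command name and the closing brace.
  const body = String(command).replace(/^\{\s*\w+|\}$/g, '');

  for (const { groups } of body.matchAll(regXFlagOrFile)) {
    if (groups.flag !== undefined) {
      flags.push(groups.flag);
    } else {
      files.push(groups.file);
    }
  }
  return { flags, files };
}

console.log(tokenizeCommand('{run -f-a-o  file1 file2}'));
// { flags: ["f", "a", "o"], files: ["file1", "file2"] }
```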