Why are these strings matching my regular expression?

I am still new to regular expressions and am trying to write one that matches valid combinations of familial relationships, such as great great grandmother (abbreviated GGgm). The abbreviations are: great = G, step = S, and father, mother, son, etc. = f, m, s, d, gm, gf.

Legal (should match): m, gf, Ggm, GGgf, Ss, SGgs

Illegal (should not match): mf, Gm, SSm, GSm

My current pattern is:

^((m|f|d|s|gf|gm|gs|gd))|^(\S(m|f|d|s|gf|gm|gs|gd))|^(\S(G)*(gm|gs|gd|gf))|^((G)*(gf|gm|gs|gd))$

However, this also matches bad combinations such as mf and Gm. How can I fix this?


Solution 1:

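In your regex, \S(m|f|d|s|gf|gm|gs|gd) matches any non-whitespace char (with \S) and then m, f, d, etc., which is why Gm matches. You probably wanted to match a literal S, but that is written as a plain S with no backslash. In addition, the final $ applies only to the last alternative, so the first alternative, ^((m|f|d|s|gf|gm|gs|gd)), is not anchored at the end and matches the leading m of mf.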
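A quick way to see why both bad strings slip through is to run the original pattern and look at what it actually consumed. This is only a sketch, assuming Python's re module (the question does not say which regex engine is in use); the name ORIGINAL is illustrative.

```python
import re

# The pattern from the question, unchanged.
ORIGINAL = (
    r"^((m|f|d|s|gf|gm|gs|gd))"
    r"|^(\S(m|f|d|s|gf|gm|gs|gd))"
    r"|^(\S(G)*(gm|gs|gd|gf))"
    r"|^((G)*(gf|gm|gs|gd))$"
)

# "mf": the first alternative has no $, so it succeeds on the leading "m"
# and simply ignores the trailing "f".
print(re.search(ORIGINAL, "mf").group(0))

# "Gm": \S accepts any non-whitespace character, so it consumes the "G"
# and the second alternative then matches "m".
print(re.search(ORIGINAL, "Gm").group(0))
```

The first call reports only m as the matched text and the second reports Gm, which shows that two separate issues are involved: a missing end anchor and \S standing in for a literal S.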

You can use

^S?(g?|G*g)[mfds]$

See the regex demo. Details:

  • ^ - start of string
  • S? - an optional S
  • (g?|G*g) - either an optional g, or zero or more G chars followed by a g char
  • [mfds] - m, f, d or s
  • $ - end of string.
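To check the pattern against the examples from the question, a small test like the following could be used. It is a minimal sketch assuming Python's re module; the names RELATION and is_valid_relation are hypothetical and chosen only for illustration.

```python
import re

# Optional S, then either an optional g or G...g, then one of m/f/d/s.
RELATION = re.compile(r"^S?(g?|G*g)[mfds]$")


def is_valid_relation(code: str) -> bool:
    """Return True if code is a well-formed relationship abbreviation."""
    return RELATION.search(code) is not None


legal = ["m", "gf", "Ggm", "GGgf", "Ss", "SGgs"]
illegal = ["mf", "Gm", "SSm", "GSm"]

# Print each sample next to the verdict so mismatches are easy to spot.
for code in legal + illegal:
    print(code, is_valid_relation(code))
```

Every legal sample should print True and every illegal one False. Note that in Python, $ also matches just before a trailing newline; if the input may end with one, re.fullmatch without the anchors is a stricter alternative.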