What's the difference between () and [] in regular expression patterns?
Solution 1:
[] denotes a character class. () denotes a capturing group.
[a-z0-9] -- One character that is in the range a-z OR 0-9.
(a-z0-9) -- Explicit capture of the literal text a-z0-9. No ranges.
a -- Can be matched by [a-z0-9].
a-z0-9 -- Can be matched by (a-z0-9), which also captures it so that it can be referenced in a replacement and/or later in the expression.
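The distinction can be checked with a short sketch. Python's re module is used here only as an assumption, since the question does not name a regex flavor; the variable names are illustrative.

```python
import re

# [a-z0-9] is a character class: it matches exactly one character.
char_class = re.compile(r"[a-z0-9]")
# (a-z0-9) is a capturing group around the literal text "a-z0-9".
group = re.compile(r"(a-z0-9)")

class_matches_a = char_class.fullmatch("a") is not None
class_matches_literal = char_class.fullmatch("a-z0-9") is not None
group_matches_a = group.fullmatch("a") is not None

# The group both matches and captures the literal text.
captured = group.fullmatch("a-z0-9").group(1)

# The captured text can be referenced in a replacement via \1 ...
replaced = re.sub(r"(a-z0-9)", r"<\1>", "x a-z0-9 y")
# ... or later in the same expression as a backreference.
doubled = re.fullmatch(r"(ab)\1", "abab") is not None
```

Here the character class accepts the single character a but rejects the seven-character text a-z0-9, while the group does the opposite.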
Solution 2:
(…) is a group that groups its contents, like parentheses in math; (a-z0-9) is the grouped literal sequence a-z0-9. Groups are particularly useful with quantifiers, which repeat the preceding expression as a whole: a*b* matches any number of a’s followed by any number of b’s, e.g. a, aaab, bbbbb, etc.; in contrast, (ab)* matches any number of ab’s, e.g. ab, abababab, etc.
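As a rough illustration of how the quantifier applies to a group as a whole, the following sketch (again assuming Python's re module; the sample strings are made up for the example) tests both patterns against the same inputs.

```python
import re

# Hypothetical sample inputs; the empty string is included because
# "any number of" also covers zero repetitions.
samples = ["", "a", "aaab", "bbbbb", "ab", "abababab", "ba"]

# a*b*: any number of a's followed by any number of b's.
separate = [s for s in samples if re.fullmatch(r"a*b*", s)]

# (ab)*: any number of repetitions of the whole sequence "ab".
grouped = [s for s in samples if re.fullmatch(r"(ab)*", s)]
```

With these inputs, abababab is accepted only by (ab)*, while aaab and bbbbb are accepted only by a*b*.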
[…] is a character class that describes the options for one single character; [a-z0-9] describes one single character that can be in the range a–z or 0–9.
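A small sketch of the "one single character" behaviour, assuming Python's re module and a handful of illustrative test strings:

```python
import re

one_char = re.compile(r"[a-z0-9]")

# fullmatch succeeds only if the entire string is one character from the class.
results = {s: one_char.fullmatch(s) is not None
           for s in ["a", "7", "A", "-", "a7"]}

# To allow several such characters, the class itself needs a quantifier.
many = re.fullmatch(r"[a-z0-9]+", "a7") is not None
```

Note that the hyphen inside the brackets only defines ranges here, so a literal - is not part of this class.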