clear meaning of .* in regex
. means match any character in regular expressions.
* means zero or more occurrences of the SINGLE regex preceding it.
My alphabet.txt contains the line
abcdefghijklmnopqrstuvwxyz
Doesn't grep a.*z alphabet.txt mean: match any substring that starts with a, has zero or more occurrences of one SINGLE character in between, and ends with z? For example abz, abbz, ahhhhhz, but not abbdz?
I thought grep a.*z alphabet.txt wouldn't match the line in my alphabet file.
* means that the immediately preceding pattern is repeated, not that the matched text is repeated. For example, [ab]* means (|[ab]|[ab][ab]|[ab][ab][ab]|…): the pattern [ab] is repeated zero or more times. It will match "aba" because that fulfills the pattern [ab][ab][ab].
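If it helps to see this concretely, here is a small sketch using Python's re module rather than grep. That is a substitution on my part: Python's regex engine is not grep's, but ., * and [ab] behave the same way for these simple patterns. The helper name matches_ab_star is made up for illustration.

```python
import re


def matches_ab_star(text: str) -> bool:
    """Return True if the WHOLE text matches [ab]* (hypothetical helper)."""
    return re.fullmatch(r"[ab]*", text) is not None


def matches_three_ab(text: str) -> bool:
    """Return True if the whole text matches the expanded form [ab][ab][ab]."""
    return re.fullmatch(r"[ab][ab][ab]", text) is not None


if __name__ == "__main__":
    # Each repetition of [ab] picks a or b independently of the others.
    for sample in ["", "a", "ab", "aba", "bbbb", "abc"]:
        print(repr(sample), matches_ab_star(sample))
```

Note that "aba" is accepted even though its characters are not all the same, and "abc" is rejected only because c is not in the set [ab].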
With .*, it becomes (|.|..|...|....|…), so it matches any number of characters, and the characters can differ. That is why a.*z matches the whole line in your alphabet.txt.
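To contrast the two readings, here is a rough sketch, again in Python's re module as a stand-in for grep (an assumption; the engines differ, but not for these constructs). The second pattern, which uses a backreference to force the SAME character to repeat, is my own illustration of what the question expected, not something grep does with a.*z. The names ANY_CHARS, SAME_CHAR and found are hypothetical.

```python
import re

LINE = "abcdefghijklmnopqrstuvwxyz"

# What a.*z actually means: a, then any characters (they may differ), then z.
ANY_CHARS = re.compile(r"a.*z")

# What the question expected: a, then zero or more copies of ONE character,
# then z. The backreference \1 repeats the text matched by (.), not the pattern.
SAME_CHAR = re.compile(r"a(?:(.)\1*)?z")


def found(pattern: re.Pattern, text: str) -> bool:
    """Like grep: True if the pattern matches anywhere in the text."""
    return pattern.search(text) is not None


if __name__ == "__main__":
    for sample in ["abz", "abbz", "ahhhhhz", "abbdz", LINE]:
        print(sample, found(ANY_CHARS, sample), found(SAME_CHAR, sample))
```

Under these assumptions, a.*z finds a match in every sample, including abbdz and the alphabet line, while the backreference version rejects those two.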