How do I use a regex in a shell script?
I am trying to match a string against a regex in a shell script. The string is a parameter of the script ($1) and is a date (MM/DD/YYYY). The regex I'm trying to use is:
^\d{2}[\/\-]\d{2}[\/\-]\d{4}$
It seems to work; I tried it on several regex-testing websites.
My shell code is:
REGEX_DATE="^\d{2}[\/\-]\d{2}[\/\-]\d{4}$"
echo "$1" | grep -q $REGEX_DATE
echo $?
The "echo $?" prints 1 no matter what string I pass as the parameter.
Do you have any idea why?
Thanks!
To complement the existing helpful answers:
Using Bash's own regex-matching operator, =~, is a faster alternative in this case, given that you're only matching a single value already stored in a variable:
set -- '12-34-5678' # set $1 to sample value
kREGEX_DATE='^[0-9]{2}[-/][0-9]{2}[-/][0-9]{4}$' # note use of [0-9] to avoid \d
[[ $1 =~ $kREGEX_DATE ]]
echo $? # 0 with the sample value, i.e., a successful match
Note, however, that the caveat about using flavor-specific regex constructs such as \d applies equally:
While =~ supports EREs (extended regular expressions), it also supports the host platform's specific extensions; it's a rare case of Bash's behavior being platform-dependent.
To remain portable (in the context of Bash), stick to the POSIX ERE specification.
Note that =~ even allows you to define capture groups (parenthesized subexpressions) whose matches you can later access through Bash's special ${BASH_REMATCH[@]} array variable.
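As a minimal sketch of the capture-group feature, assuming the same date shape as in the question: the variable names input, re, month, day, and year are illustrative only, and the pattern checks the shape of the string, not whether it is a valid calendar date.

```shell
#!/usr/bin/env bash
# Illustrative names; only [[ ... =~ ... ]] and BASH_REMATCH come from Bash itself.
input='12-34-5678'
re='^([0-9]{2})[-/]([0-9]{2})[-/]([0-9]{4})$'

if [[ $input =~ $re ]]; then
  # Index 0 holds the whole match; indices 1..n hold the capture groups in order.
  month=${BASH_REMATCH[1]}
  day=${BASH_REMATCH[2]}
  year=${BASH_REMATCH[3]}
  echo "month=$month day=$day year=$year"
else
  echo "no match" >&2
fi
```

BASH_REMATCH is overwritten by each successful =~ match, so copy the values you need into your own variables before matching again.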
Further notes:
- $kREGEX_DATE is used unquoted, which is necessary for the regex to be recognized as such (quoted parts would be treated as literals).
- While not always necessary, it is advisable to store the regex in a variable first, because Bash has trouble with regex literals containing \.
  - E.g., on Linux, where \< is supported to match word boundaries, [[ 3 =~ \<3 ]] && echo yes doesn't work, but re='\<3'; [[ 3 =~ $re ]] && echo yes does.
- I've changed the variable name REGEX_DATE to kREGEX_DATE (k signaling a (conceptual) constant), so as to ensure that the name isn't an all-uppercase name, because all-uppercase variable names should be avoided to prevent conflicts with special environment and shell variables.
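To illustrate the note about quoting, here is a small sketch that compares the unquoted and quoted forms of the right-hand operand. The pattern and the variable names re, unquoted, and quoted are hypothetical and chosen only for the demonstration.

```shell
#!/usr/bin/env bash
# Hypothetical pattern: "a", then any single character, then "c".
re='^a.c$'

# Unquoted: $re is interpreted as a regex, so "." matches the "b" in "abc".
if [[ abc =~ $re ]]; then unquoted='match'; else unquoted='no match'; fi

# Quoted: the operand is treated as a literal string, so Bash looks for
# the text ^a.c$ inside "abc" and does not find it.
if [[ abc =~ "$re" ]]; then quoted='match'; else quoted='no match'; fi

echo "unquoted: $unquoted"
echo "quoted:   $quoted"
```

With Bash 3.2 or later, this is expected to report a match for the unquoted form and no match for the quoted one.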
I think this is what you want:
REGEX_DATE='^\d{2}[/-]\d{2}[/-]\d{4}$'
echo "$1" | grep -P -q "$REGEX_DATE"
echo $?
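I've used the -P switch to get Perl-compatible regular expressions, which understand \d. Note that -P is a GNU grep extension, so it may not be available with other grep implementations.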
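Where -P is not available, a portable variant is possible. The following is a sketch, assuming grep -E (POSIX extended regular expressions) and the bracket expression [0-9] in place of \d. The function name is_date is a hypothetical helper, not something from the answers above.

```shell
#!/bin/sh
# is_date is a hypothetical helper. It succeeds (exit status 0) when its
# argument has the shape NN/NN/NNNN or NN-NN-NNNN; it does not check that
# the digits form a real calendar date.
is_date() {
  printf '%s\n' "$1" | grep -Eq '^[0-9]{2}[-/][0-9]{2}[-/][0-9]{4}$'
}

set -- '12/34/5678' # set $1 to a sample value

if is_date "$1"; then
  echo "looks like a date: $1"
else
  echo "not a date: $1"
fi
```

Keeping the pattern in single quotes stops the shell from expanding $ or the brackets before grep sees them.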