How can I convert ereg expressions to preg in PHP?
Because POSIX regular expressions (ereg) have been deprecated since PHP 5.3.0, I'd like to know an easy way to convert the old expressions to PCRE (Perl Compatible Regular Expressions) (preg).
For example, I have this regular expression:
eregi('^hello world', $str);
How can I translate such expressions into preg_match-compatible expressions?
Note: This post serves as a placeholder for all posts related to conversion from ereg to preg, and as a duplicate target for related questions. Please do not close this question.
Related:
- How to change PHP's eregi to preg_match
- Changing ereg_replace to equivalent preg_replace
Solution 1:
The biggest change in the syntax is the addition of delimiters.
ereg('^hello', $str);
preg_match('/^hello/', $str);
Delimiters can be pretty much anything that is not alphanumeric, a backslash, or a whitespace character. The most commonly used are ~, / and #.
You can also use matching brackets:
preg_match('[^hello]', $str);
preg_match('(^hello)', $str);
preg_match('{^hello}', $str);
// etc
If your delimiter is found in the regular expression, you have to escape it:
ereg('^/hello', $str);
preg_match('/^\/hello/', $str);
You can easily escape all delimiters and reserved characters in a string by using preg_quote:
$expr = preg_quote('/hello', '/');
preg_match('/^'.$expr.'/', $str);
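As a minimal sketch of the delimiter options above, the following compares escaping the delimiter, switching to a delimiter that does not occur in the pattern, and quoting a literal with preg_quote. The sample strings are made up for illustration.

```php
<?php
// Sample input; the value is illustrative only.
$str = '/hello world';

// Option 1: keep "/" as the delimiter and escape it inside the pattern.
$a = preg_match('/^\/hello/', $str);

// Option 2: pick a delimiter that does not occur in the pattern,
// so nothing needs escaping.
$b = preg_match('~^/hello~', $str);

// Option 3: treat arbitrary text as a literal. preg_quote() escapes regex
// metacharacters, plus the delimiter passed as its second argument.
$literal = 'a.b/c';
$quoted  = preg_quote($literal, '/');              // a\.b\/c
$c = preg_match('/^' . $quoted . '$/', 'a.b/c');   // matches the literal text
$d = preg_match('/^' . $quoted . '$/', 'axb/c');   // "." is no longer a wildcard

var_dump($a, $b, $c, $d); // int(1), int(1), int(1), int(0)
```

Switching delimiters is usually the more readable choice when the pattern is full of slashes, such as paths or URLs.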
Also, PCRE supports modifiers for various things. One of the most used is the case-insensitive modifier i, the alternative to eregi:
eregi('^hello', 'HELLO');
preg_match('/^hello/i', 'HELLO');
You can find the complete reference to PCRE syntax in PHP in the manual, as well as a list of differences between POSIX regex and PCRE to help you convert your expressions.
However, in your simple example you would not use a regular expression:
stripos($str, 'hello world') === 0
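A quick way to convince yourself that the string function is equivalent for this particular pattern is to run both checks side by side. This is only a sketch; the sample inputs are invented for illustration.

```php
<?php
// Hypothetical sample inputs used only for illustration.
$samples = ['Hello World!', 'HELLO WORLD', 'say hello world'];

$results = [];
foreach ($samples as $s) {
    // Case-insensitive match anchored at the start of the string.
    $viaRegex = preg_match('/^hello world/i', $s) === 1;
    // Same check without a regex: the needle must be found at offset 0.
    $viaString = stripos($s, 'hello world') === 0;
    $results[$s] = [$viaRegex, $viaString];
}

// Both checks agree: true for the first two samples, false for the third.
var_dump($results);
```

Note the strict comparison with 0: stripos returns FALSE when the needle is absent, and a loose comparison would treat that as a match at offset 0.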
Solution 2:
Replacing ereg with preg (as of PHP 5.3.0) was the right move, and it works in our favor.
preg_match, which uses a Perl-compatible regular expression syntax, is often a faster alternative to ereg.
You should know four main things to port ereg patterns to preg:
1. Add delimiters (/):
'pattern' => '/pattern/'
2. Escape the delimiter if it is part of the pattern:
'patt/ern' => '/patt\/ern/'
You can do this programmatically as follows:
$old_pattern = '<div>.+</div>';
$new_pattern = '/' . addcslashes($old_pattern, '/') . '/';
3. eregi (case-insensitive matching):
'pattern' => '/pattern/i'
So, if you are using the eregi function for case-insensitive matching, just add 'i' at the end of the new pattern ('/pattern/').
4. ASCII values: In ereg (notably ereg_replace), a pattern given as an integer rather than a string is taken to be the ASCII value of a character. preg does not treat numbers this way. So, if your ereg call relies on an ASCII value (for example, for a newline or a tab), convert it to hexadecimal and prefix it with \x.
Example: 9 (tab) becomes \x09, or alternatively use \t.
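The points above can be folded into a small helper. This is a sketch only: the function name ereg_to_preg is hypothetical, it assumes PHP 7 or later (for the type declarations), and it assumes the incoming pattern does not already contain backslash-escaped slashes. It does not translate any POSIX-only syntax.

```php
<?php
/**
 * Hypothetical helper: wrap a POSIX (ereg) pattern so preg_* can use it.
 * It only adds delimiters, escapes the delimiter, and optionally appends
 * the "i" modifier.
 */
function ereg_to_preg(string $pattern, bool $caseInsensitive = false): string
{
    $delimiter = '/';
    // addcslashes() puts a backslash in front of every "/" in the pattern.
    $escaped = addcslashes($pattern, $delimiter);

    return $delimiter . $escaped . $delimiter . ($caseInsensitive ? 'i' : '');
}

echo ereg_to_preg('<div>.+</div>'), "\n"; // /<div>.+<\/div>/
echo ereg_to_preg('^hello', true), "\n";  // /^hello/i

// The converted pattern can be passed straight to preg_match().
var_dump(preg_match(ereg_to_preg('^hello', true), 'HELLO world')); // int(1)

// A tab written as a hexadecimal escape, as described in the ASCII note.
var_dump(preg_match('/^a\x09b$/', "a\tb")); // int(1)
```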
Solution 3:
From PHP version 5.3, ereg is deprecated.
Moving from ereg to preg_match is just a small change in our pattern.
First, you have to add delimiters to your pattern, e.g.:
ereg('A-Z0-9a-z', 'string');
to
preg_match('/A-Z0-9a-z/', 'string');
For eregi case-insensitive matching, put i after the last delimiter, e.g.:
eregi('pattern', 'string');
to
preg_match('/pattern/i', 'string');
Solution 4:
There are more differences between ereg() and preg_match() than just the syntax:
- Return value:
  - On error: both return FALSE
  - On no match: ereg() returns FALSE, preg_match() returns 0
  - On match: ereg() returns the string length or 1, preg_match() always returns 1
- Resulting array of matched substrings: If some substring is not found at all ((b) in ...a(b)?), the corresponding item in the ereg() result will be FALSE, while in preg_match() it will not be set at all.
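If existing code depends on the ereg() behaviour described above, a thin wrapper can approximate it on top of preg_match(). The sketch below is an assumption-laden illustration, not a drop-in replacement: the name ereg_like is hypothetical, the pattern must already be a PCRE pattern with delimiters, and it assumes PHP 7.4 or later, where the PREG_UNMATCHED_AS_NULL flag also reports trailing unmatched groups.

```php
<?php
/**
 * Hypothetical shim approximating ereg()'s return value and match array.
 * $pattern must already be a PCRE pattern (with delimiters).
 */
function ereg_like(string $pattern, string $subject, ?array &$regs = null)
{
    // ereg() only returns the match length when the third argument is passed.
    $wantsRegs = func_num_args() >= 3;

    $result = preg_match($pattern, $subject, $matches, PREG_UNMATCHED_AS_NULL);
    if ($result !== 1) {
        return false; // FALSE both on error and on no match
    }
    if (!$wantsRegs) {
        return 1;
    }

    // Report unmatched groups as FALSE instead of NULL.
    $regs = array_map(function ($m) {
        return $m === null ? false : $m;
    }, $matches);

    // Length of the whole match, or 1 if the match is empty.
    $length = strlen($matches[0]);
    return $length > 0 ? $length : 1;
}

var_dump(ereg_like('/a(b)?/', 'xab', $regs)); // int(2)
var_dump(ereg_like('/a(b)?/', 'xa', $regs));  // int(1)
var_dump($regs);                              // [0 => 'a', 1 => false]
var_dump(ereg_like('/z/', 'abc'));            // bool(false)

// For comparison, plain preg_match() leaves the unmatched group unset.
preg_match('/a(b)?/', 'xa', $plain);
var_dump(isset($plain[1]));                   // bool(false)
```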
If you are not brave enough to convert your ereg() calls to preg_match(), you may use mb_ereg(), which is still available in PHP 7.