Can someone explain the /e regex modifier? [duplicate]
Solution 1:
The e Regex Modifier in PHP with example vulnerability & alternatives
What e does, with an example...
The e modifier is a deprecated regex modifier which makes preg_replace evaluate the replacement string as PHP code after the backreferences have been substituted. This means that whatever you pass in will be evaluated as a part of your program.
For example, we can use something like this:
$input = "Bet you want a BMW.";
echo preg_replace("/([a-z]*)/e", "strtoupper('\\1')", $input);
This will output BET YOU WANT A BMW.
Without the e modifier, we get this very different output:
strtoupper('')Bstrtoupper('et')strtoupper('') strtoupper('you')strtoupper('') strtoupper('want')strtoupper('') strtoupper('a')strtoupper('') strtoupper('')Bstrtoupper('')Mstrtoupper('')Wstrtoupper('').strtoupper('')
Potential security issues with e...
The e modifier is deprecated for security reasons. Here's an example of an issue you can run into very easily with e:
$password = 'secret';
...
$input = $_GET['input'];
echo preg_replace('|^(.*)$|e', '"\1"', $input);
If I submit my input as $password, the output of this code will be secret. It's therefore very easy for me to read session variables and any other variable used on the back end, and even to take deeper control of your application (system('cat /etc/passwd');?) through this simple piece of poorly written code.
Like the similarly deprecated mysql libraries, this doesn't mean that you cannot write secure code using e, just that it's more difficult to do so.
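For contrast, here is a minimal sketch of the same pass-through written with preg_replace_callback, assuming PHP 5.3 or later for the closure. The function name echo_input is hypothetical and not part of the original code. The captured text reaches the callback as an ordinary string and is never evaluated, so submitting $password gives back the literal text rather than the variable's value.

```php
<?php
$password = 'secret';

// Hypothetical helper: returns the input unchanged, without evaluating it.
function echo_input($input)
{
    return preg_replace_callback(
        '|^(.*)$|',
        function ($matches) {
            // $matches[1] is plain data; any PHP syntax inside it stays inert.
            return $matches[1];
        },
        $input
    );
}

echo echo_input('$password'); // prints $password, not secret
```

The single quotes around '$password' in the call stand in for raw user input such as $_GET['input'].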
What you should use instead...
You should use preg_replace_callback in nearly all places you would consider using the e modifier. The code is definitely not as brief in this case but don't let that fool you -- it's twice as fast:
$input = "Bet you want a BMW.";
echo preg_replace_callback(
    "/([a-z]*)/",
    function ($matches) {
        // $matches[1] holds the text captured by the first group
        return strtoupper($matches[1]);
    },
    $input
);
On performance, there's no reason to use e...
Unlike the mysql libraries (which were also deprecated for security purposes), e is not quicker than its alternatives for most operations. For the example given, it's twice as slow: preg_replace_callback (0.14 sec for 50,000 operations) vs e modifier (0.32 sec for 50,000 operations).
Solution 2:
The e modifier is a PHP-specific modifier that triggers PHP to run the resulting string as PHP code. It is basically an eval() wrapped inside a regex engine.
eval() on its own is considered a security risk and a performance problem; wrapping it inside a regex amplifies both those issues significantly.
It is therefore considered bad practice; it was formally deprecated in PHP 5.5 and removed in PHP 7.0.
PHP has provided for several versions now an alternative solution in the form of preg_replace_callback(), which uses callback functions instead of eval(). This is the recommended method of doing this kind of thing.
With specific regard to the code you've quoted:
I don't see an e modifier in the sample code you've given in the question. It has a slash at each end as the regex delimiter; the e would have to be outside of that, and it isn't. Therefore I don't think the code you've quoted is likely to be directly vulnerable to having an e modifier injected into it.
However, if $input contains any / characters, it will be vulnerable to being entirely broken (i.e. throwing an error due to invalid regex). The same would apply if it had anything else that made it an invalid regular expression.
Because of this, it is a bad idea to use an unvalidated user input string as part of a regex pattern. Even if you are sure that it can't be hacked to use the e modifier, there's plenty of other mischief that could be achieved with it.
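The code from the question is not reproduced here, so as a hedged illustration, assume user input is placed between / delimiters to search for it literally. preg_quote() escapes regex metacharacters, and passing '/' as its second argument escapes the delimiter as well. The helper name count_literal and the sample strings are hypothetical, and the two-argument preg_match_all call assumes PHP 5.4 or later.

```php
<?php
// Hypothetical helper: counts literal occurrences of user-supplied text.
function count_literal($needle, $haystack)
{
    // Escape regex metacharacters and the '/' delimiter in the user input.
    $pattern = '/' . preg_quote($needle, '/') . '/';

    // preg_match_all() returns the number of full pattern matches.
    return preg_match_all($pattern, $haystack);
}

echo count_literal('a/b', 'a/b and a/b again'), "\n"; // 2
echo count_literal('.*', 'literal .* here'), "\n";    // 1
```

This only covers the case where the input is meant to be matched literally. If users are supposed to supply actual patterns, the input needs separate validation.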
Solution 3:
As explained in the manual, the /e modifier makes preg_replace evaluate the replacement string as PHP code once the backreferences have been substituted. The example given in the manual is:
$html = preg_replace(
'(<h([1-6])>(.*?)</h\1>)e',
'"<h$1>" . strtoupper("$2") . "</h$1>"',
$html
);
This matches any "<hX>XXXXX</hX>" text (i.e. headline HTML tags), replaces this text with "<hX>" . strtoupper("XXXXX") . "</hX>", then executes "<hX>" . strtoupper("XXXXX") . "</hX>" as PHP code, then puts the result back into the string.
If you run this on arbitrary user input, any user has a chance to slip something in which will actually be evaluated as PHP code. Done correctly, this lets the user execute any code they want. In the above example, imagine if in the second step the text were "<hX>" . strtoupper("" . shell_exec('rm -rf /') . "") . "</hX>".
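The headline example can be written without /e. Below is a minimal sketch using preg_replace_callback, assuming PHP 5.3 or later for the closure; the function name upper_headlines is hypothetical. The captured groups arrive in the callback as plain strings, so nothing in the heading text is executed.

```php
<?php
// Hypothetical helper: upper-cases the text inside <h1>..<h6> tags.
function upper_headlines($html)
{
    return preg_replace_callback(
        '(<h([1-6])>(.*?)</h\1>)',
        function ($m) {
            // $m[1] is the heading level, $m[2] the heading text.
            return '<h' . $m[1] . '>' . strtoupper($m[2]) . '</h' . $m[1] . '>';
        },
        $html
    );
}

echo upper_headlines('<h2>hello</h2><p>text</p>'), "\n";
// <h2>HELLO</h2><p>text</p>
```

A heading such as <h1>{${phpinfo()}}</h1> is simply upper-cased like any other text; the function call inside it never runs.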
Solution 4:
It's evil, that's all you need to know :p
More specifically, it generates the replacement string as normal, but then runs it through eval.
You should use preg_replace_callback instead.