How to write a recursive regex that matches nested parentheses?

Solution 1:

This pattern works:

$pattern = '~ \( (?: [^()]+ | (?R) )*+ \) ~x';

The content inside parenthesis is simply describe:

"all that is not parenthesis OR recursion (= other parenthesis)" x 0 or more times

If you want to catch all substrings inside parenthesis, you must put this pattern inside a lookahead to obtain all overlapping results:

$pattern = '~(?= ( \( (?: [^()]+ | (?1) )*+ \) ) )~x';
preg_match_all($pattern, $subject, $matches);
print_r($matches[1]);

Note that I have added a capturing group and I have replaced (?R) by (?1):

(?R) -> refers to the whole pattern (You can write (?0) too)
(?1) -> refers to the first capturing group

What is this lookahead trick?

A subpattern inside a lookahead (or a lookbehind) doesn't match anything, it's only an assertion (a test). Thus, it allows to check the same substring several times.

If you display the whole pattern results (print_r($matches[0]);), you will see that all results are empty strings. The only way to obtain the substrings found by the subpattern inside the lookahead is to enclose the subpattern in a capturing group.

Note: the recursive subpattern can be improved like this:

\( [^()]*+ (?: (?R) [^()]* )*+ \)

Solution 2:

When I found this answer I wasn't able to figure out how to modify the pattern to work with my own delimiters which where { and }. So my approach was to make it more generic.

Here is a script to generate the regex pattern with your own variable left and right delimiters.

$delimiter_wrap  = '~';
$delimiter_left  = '{';/* put YOUR left delimiter here.  */
$delimiter_right = '}';/* put YOUR right delimiter here. */

$delimiter_left  = preg_quote( $delimiter_left,  $delimiter_wrap );
$delimiter_right = preg_quote( $delimiter_right, $delimiter_wrap );
$pattern         = $delimiter_wrap . $delimiter_left
                 . '((?:[^' . $delimiter_left . $delimiter_right . ']++|(?R))*)'
                 . $delimiter_right . $delimiter_wrap;

/* Now you can use the generated pattern. */
preg_match_all( $pattern, $subject, $matches );