Compile regex in PHP

Solution 1:

The Perl-Compatible Regular Expressions (PCRE) extension may already be optimized for your use case, even though it does not provide a Regex class as some other languages do:

This extension maintains a global per-thread cache of compiled regular expressions (up to 4096).

PCRE Introduction

This cache is how the study modifier that Imran described can keep the compiled expression between calls.
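As a minimal sketch of what relying on that cache looks like in practice: there is no explicit compile step, so the pattern is kept as a plain string and passed to the preg function on every call. The constant name `BOLD_TAG`, the helper `extractBold()` and the sample input are hypothetical, chosen only for illustration.

```php
<?php
// Hypothetical pattern, stored once as a plain string (no "compiled regex" object).
const BOLD_TAG = '/<b>(.*?)<\/b>/';

// Hypothetical helper: returns the text inside every <b>...</b> pair.
function extractBold(string $html): array
{
    // The pattern string is identical on every call, so after the first call
    // the extension can look up the compiled form in its cache.
    preg_match_all(BOLD_TAG, $html, $matches);
    return $matches[1];
}

$result = extractBold('The <b>fox</b> and the <b>cat</b>');
print_r($result); // array with 'fox' and 'cat'
```

Because the cache is keyed on the pattern itself, the pattern string should stay the same between calls for it to be reused.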

Solution 2:

The preg functions accept the uppercase S (study) modifier, which is probably what you're looking for. Note that the manual states that this modifier has no effect as of PHP 7.3.0.

http://www.php.net/manual/en/reference.pcre.pattern.modifiers.php

S

When a pattern is going to be used several times, it is worth spending more time analyzing it in order to speed up the time taken for matching. If this modifier is set, then this extra analysis is performed. At present, studying a pattern is useful only for non-anchored patterns that do not have a single fixed starting character.
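For illustration, the modifier is written after the closing delimiter, like any other pattern modifier. This is only a sketch: the sample text and pattern are assumptions, and whether the extra analysis makes a measurable difference depends on the PHP and PCRE versions in use.

```php
<?php
// Sample input, assumed for illustration.
$text = 'The big brown <b>fox</b> jumped over a lazy <b>cat</b>';

// The trailing "S" after the closing delimiter requests the study step.
// The pattern is non-anchored, which is the case the manual describes.
$count = preg_match_all('/<b>(.*?)<\/b>/S', $text, $m);

echo $count, "\n";           // number of matches found
echo implode(', ', $m[1]);   // captured contents of each <b> tag
```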

Solution 3:

"Thread" here means the thread that the script is currently running in. After a pattern is first used, its compiled form is cached, so PHP does not compile it again the next time it is used.

Simple test:

<?php

function microtime_float() {
    // microtime(true) returns the current Unix timestamp as a float
    return microtime(true);
}

// test string
$text='The big brown <b>fox</b> jumped over a lazy <b>cat</b>';
$testTimes=10;


// regexp with caching: the pattern string is identical on every call
$avg=0;
for ($x=0; $x<$testTimes; $x++)
{
    $start=microtime_float();
    for ($i=0; $i<10000; $i++) {
        preg_match_all('/<b>(.*)<\/b>0?/', $text, $m);
    }
    $end=microtime_float();
    $avg += (float)$end-$start;
}

echo 'Regexp with caching avg '.($avg/$testTimes);

// regexp without caching: $i makes the pattern string different on every call
$avg=0;
for ($x=0; $x<$testTimes; $x++)
{
    $start=microtime_float();
    for ($i=0; $i<10000; $i++) {
        $pattern='/<b>(.*)<\/b>'.$i.'?/';
        preg_match_all($pattern, $text, $m);
    }
    $end=microtime_float();
    $avg += (float)$end-$start;
}

echo '<br/>Regexp without caching avg '.($avg/$testTimes);

Output:

Regexp with caching avg 0.1
Regexp without caching avg 0.8

In this test, the cached regexp ran about 8 times faster than the uncached one.