preg_match(): Compilation failed: invalid range in character class at offset

Thank you in advance for your time in helping with this issue.

preg_match(): Compilation failed: invalid range in character class at offset 20 session.php on line 278

This suddenly stopped working after months without problems, following a PHP upgrade on our server.

Here is the code:

    else{
     /* Spruce up username, check length */
     $subuser = stripslashes($subuser);
     if(strlen($subuser) < $config['min_user_chars']){
        $form->setError($field, "* Username below ".$config['min_user_chars']." characters");
     }
     else if(strlen($subuser) > $config['max_user_chars']){
        $form->setError($field, "* Username above ".$config['max_user_chars']." characters");
     }


     /* Check if username is not alphanumeric */
    /* PREG_MATCH CODE */

     else if(!preg_match("/^[a-z0-9]([0-9a-z_-\s])+$/i", $subuser)){        
        $form->setError($field, "* Username not alphanumeric");
     }


    /* PREG_MATCH CODE */


     /* Check if username is reserved */
     else if(strcasecmp($subuser, GUEST_NAME) == 0){
        $form->setError($field, "* Username reserved word");
     }
     /* Check if username is already in use */
     else if($database->usernameTaken($subuser)){
        $form->setError($field, "* Username already in use");
     }
     /* Check if username is banned */
     else if($database->usernameBanned($subuser)){
        $form->setError($field, "* Username banned");
     }
  }

Solution 1:

The question is old, but there are some new developments in PHP 7.3 and newer versions that need to be covered. PHP's PCRE engine migrated to PCRE2, and the PCRE library version used in PHP 7.3 is 10.32, which is where the backward-incompatible changes originate:

  • Internal library API has changed
  • The 'S' modifier has no effect, patterns are studied automatically. No real impact.
  • The 'X' modifier is the default behavior in PCRE2. The current patch reverts the behavior to the meaning of 'X' how it was in PCRE, but it might be better to go with the new behavior and have 'X' turned on by default. So currently no impact, too.
  • Some behavior change due to the newer Unicode engine was sighted. It's Unicode 10 in PCRE2 vs Unicode 7 in PCRE. Some behavior change can be sighted with invalid patterns.

According to the PCRE2 10.33 changelog:

  1. With PCRE2_EXTRA_BAD_ESCAPE_IS_LITERAL set, escape sequences such as \s which are valid in character classes, but not as the end of ranges, were being treated as literals. An example is [_-\s] (but not [\s-_] because that gave an error at the start of a range). Now an "invalid range" error is given independently of PCRE2_EXTRA_BAD_ESCAPE_IS_LITERAL.

Before PHP 7.3, you could use the hyphen in a character class in any position if you escaped it, or if you put it "in a position where it cannot be interpreted as indicating a range". In PHP 7.3, it seems PCRE2_EXTRA_BAD_ESCAPE_IS_LITERAL is not set. So, from now on, to put a literal hyphen into a character class, either escape it or place it at the very start or very end of the class.
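As a minimal sketch of the placement rules above (the array labels and the sample input are made up for illustration and are not part of the question's code), each of the three variants below keeps the hyphen literal, so none of them should trigger the "invalid range" error:

```php
<?php
// Three ways to allow a literal hyphen next to \s inside a character class.
// Single-quoted strings keep the backslashes exactly as written.
$patterns = [
    'escaped'  => '/^[a-z0-9][0-9a-z_\-\s]+$/i', // hyphen escaped as \-
    'at end'   => '/^[a-z0-9][0-9a-z_\s-]+$/i',  // hyphen is the last character
    'at start' => '/^[a-z0-9][-0-9a-z_\s]+$/i',  // hyphen is the first character
];

$results = [];
foreach ($patterns as $label => $pattern) {
    // preg_match() returns 1 on a match, 0 on no match, false on error.
    $results[$label] = preg_match($pattern, 'john-doe 42');
}

var_dump($results); // each entry is expected to be int(1)
```

Escaping is arguably the most robust choice, because the pattern stays valid even if someone later appends more characters to the class.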

See also this reference:

In simple terms, PCRE2 is stricter about pattern validation, so after the upgrade some of your existing patterns may no longer compile.

Here is the simple snippet used on php.net:

preg_match('/[\w-.]+/', ''); // this will not work in PHP 7.3
preg_match('/[\w\-.]+/', ''); // the hyphen needs to be escaped

As you can see from the example above, there is a small but significant difference between the two lines.

Solution 2:

A character class range is defined by using - between two values in a character class ([] in regex). [0-9] means everything between 0 and 9, inclusive. In the regular expression in your code, you have several character class ranges: a-z and 0-9. There is also one range that you probably didn't mean to put there, namely _-\s.

"/^[a-z0-9]([0-9a-z_-\s])+$/i"
                   ^^^^ 

This is apparently not treated as an invalid character range in some (most?) versions of PCRE (the regular expression library PHP uses), but that may have changed recently; if the PCRE library was upgraded on the server, that could be the reason.

Debuggex is a nice tool that can help debug errors like this (I'm not affiliated, just a fan), although the error message from PHP already told you both the line and the character offset where the error was.
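One thing that can make this kind of failure confusing: preg_match() returns false when the pattern fails to compile, and `!false` is true, so a check written as `!preg_match(...)` treats a broken pattern the same as a non-matching username. A minimal sketch of a check that separates the two cases follows; the helper name `usernameLooksValid` is hypothetical and not part of the original session.php:

```php
<?php
// Hypothetical helper: tells "the pattern is broken" apart from
// "the username does not match".
function usernameLooksValid(string $pattern, string $username): bool
{
    // @ silences the E_WARNING so the failure can be handled explicitly.
    $result = @preg_match($pattern, $username);

    if ($result === false) {
        // Compilation or execution failed; this is a bug in the pattern,
        // not a problem with the user's input.
        throw new RuntimeException("Regex failed: $pattern");
    }

    return $result === 1;
}

// Usage with the escaped-hyphen version of the pattern:
var_dump(usernameLooksValid('/^[a-z0-9]([0-9a-z_\-\s])+$/i', 'john_doe'));
```

With a check like this, a pattern that stops compiling after an upgrade surfaces as an exception instead of every username being reported as "not alphanumeric".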

Solution 3:

Whether you get this error depends on the version of the regex engine your PHP uses.

You can escape the hyphen to make its use explicit, that is, write \- instead of -.

Your final code:

/^[a-z0-9]([0-9a-z_\-\s])+$/i
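To see what the corrected pattern accepts, here is a small sketch that runs it against a few usernames. The sample names are invented for illustration:

```php
<?php
$pattern = '/^[a-z0-9]([0-9a-z_\-\s])+$/i';

// Made-up sample inputs, not taken from the question.
$samples = ['john_doe', 'mary-ann 7', '_leading', 'a', 'bad!name'];

$accepted = [];
foreach ($samples as $name) {
    // Compare against 1 so that a false (error) result is never
    // mistaken for a match.
    $accepted[$name] = preg_match($pattern, $name) === 1;
    echo str_pad($name, 12), $accepted[$name] ? 'accepted' : 'rejected', PHP_EOL;
}
```

Note that the pattern requires at least two characters (one from the first class, then one or more from the second), so a single-letter username such as 'a' is rejected, as are names that start with an underscore or contain other punctuation.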