Escaping MySQL wildcards

Solution 1:

_ and % are not wildcards in MySQL in general, and should not be escaped for the purposes of putting them into normal string literals. mysql_real_escape_string is correct and sufficient for this purpose. addcslashes should not be used.

_ and % are special solely in the context of LIKE-matching. When you want to prepare strings for literal use in a LIKE statement, so that 100% matches one-hundred-percent and not just any string starting with a hundred, you have two levels of escaping to worry about.

The first is LIKE escaping. LIKE handling takes place entirely inside SQL, and if you want to turn a literal string into a literal LIKE expression you must perform this step even if you are using parameterised queries!

In this scheme, _ and % are special and must be escaped. The escape character must also be escaped. According to ANSI SQL, characters other than these must not be escaped: \' would be wrong. (Though MySQL will typically let you get away with it.)

Having done this, you proceed to the second level of escaping, which is plain old string literal escaping. This takes place outside of SQL, creating SQL, so must be done after the LIKE escaping step. For MySQL, this is mysql_real_escape_string as before; for other databases there will be a different function, or you can just use parameterised queries to avoid having to do it.

The problem that leads to confusion here is that MySQL uses a backslash as an escape character for both of the nested escaping steps! So if you wanted to match a string against a literal percent sign you would have to double-backslash-escape and say LIKE 'something\\%'. Or, if that's in a PHP " literal which also uses backslash escaping, "LIKE 'something\\\\%'". Argh!
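
To make the layering concrete, here is a small sketch that builds the same condition one level at a time and checks it against the hand-written PHP literal. The str_replace call only stands in for what string-literal escaping does to backslashes; it is not a substitute for the real escaping function.

```php
<?php
// Level 1: LIKE escaping with MySQL's default escape character (backslash).
// In a single-quoted PHP literal, \\ is one backslash: something\%
$like = 'something\\%';

// Level 2: string-literal escaping doubles each backslash.
// (Stand-in for illustration only; it ignores quotes and other characters.)
$sql = "LIKE '" . str_replace('\\', '\\\\', $like) . "'";

// The same SQL typed directly as a PHP double-quoted literal:
// four backslashes in the source become two in the SQL text.
$direct = "LIKE 'something\\\\%'";

echo $sql, "\n";             // LIKE 'something\\%'
var_dump($sql === $direct);  // bool(true)
```

Three layers (LIKE, SQL string literal, PHP string literal), all sharing one escape character, is exactly why the backslash count gets out of hand.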

This behaviour is incorrect according to ANSI SQL, which says that in string literals backslashes mean literal backslashes and the way to escape a single quote is '', and that in LIKE expressions there is no escape character at all by default.

So if you want to LIKE-escape in a portable way, you should override the default (wrong) behaviour and specify your own escape character, using the LIKE ... ESCAPE ... construct. For sanity, we'll choose something other than the damn backslash!

function like($s, $e) {
    return str_replace(array($e, '_', '%'), array($e.$e, $e.'_', $e.'%'), $s);
}

$escapedname= mysql_real_escape_string(like($name, '='));
$query= "... WHERE name LIKE '%$escapedname%' ESCAPE '=' AND ...";
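
As a quick check of what the LIKE-escaping step produces on its own, here is the same logic as like() above under a hypothetical, more descriptive name (likeEscape is my naming, not part of the original), run against a few sample inputs.

```php
<?php
// Same logic as like() above; only the name is different (hypothetical).
function likeEscape(string $s, string $e): string
{
    // The escape character comes first in the search array. str_replace
    // applies the pairs in order, so doubling it first leaves the escape
    // characters that are later added in front of _ and % untouched.
    return str_replace([$e, '_', '%'], [$e . $e, $e . '_', $e . '%'], $s);
}

echo likeEscape('100%', '='), "\n";        // 100=%
echo likeEscape('snake_case', '='), "\n";  // snake=_case
echo likeEscape('a=b', '='), "\n";         // a==b
echo likeEscape("O'Reilly", '='), "\n";    // O'Reilly
```

Note that the single quote in the last example is left alone: that is the job of the string literal escaping step (or of the parameter binding), not of LIKE escaping.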

or with parameters (e.g. in PDO):

$q= $db->prepare("... WHERE name LIKE ? ESCAPE '=' AND ...");
$q->bindValue(1, '%'.like($name, '=').'%', PDO::PARAM_STR);
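
For an end-to-end check of the parameterised version, here is a rough sketch against an in-memory SQLite database. SQLite is used only so the example runs without a server; the table, the sample rows and the names likeEscape and demo are all made up for illustration, and the demo only runs when the pdo_sqlite extension is available.

```php
<?php
// Same logic as like() above, under a hypothetical name.
function likeEscape(string $s, string $e): string
{
    return str_replace([$e, '_', '%'], [$e . $e, $e . '_', $e . '%'], $s);
}

// Builds a throwaway table and returns the names that contain $needle literally.
function demo(string $needle): array
{
    $db = new PDO('sqlite::memory:');
    $db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
    $db->exec("CREATE TABLE items (name TEXT)");
    $db->exec("INSERT INTO items (name) VALUES ('100% cotton'), ('1000 threads'), ('100 percent')");

    $q = $db->prepare("SELECT name FROM items WHERE name LIKE ? ESCAPE '=' ORDER BY name");
    // The surrounding % are real wildcards; only the user-supplied part is escaped.
    $q->bindValue(1, '%' . likeEscape($needle, '=') . '%', PDO::PARAM_STR);
    $q->execute();
    return $q->fetchAll(PDO::FETCH_COLUMN);
}

if (extension_loaded('pdo_sqlite')) {
    print_r(demo('100%')); // only '100% cotton' should come back
}
```

Without the like() step the bound pattern would be %100%%, which matches all three rows.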

(If you want more portability party time, you can also have fun trying to account for MS SQL Server and Sybase, where the [ character is also, incorrectly, special in a LIKE statement and has to be escaped. argh.)
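
If you do need to cover those databases, one possible variant simply adds [ to the list of characters to escape. This is a sketch assuming the same ESCAPE '=' clause is in use; likeEscapeBracket is a made-up name, and I have not checked it against every SQL Server or Sybase version.

```php
<?php
// Hypothetical variant of like() that also escapes '[' for T-SQL style LIKE.
function likeEscapeBracket(string $s, string $e): string
{
    // As before, the escape character itself is handled first.
    return str_replace(
        [$e, '_', '%', '['],
        [$e . $e, $e . '_', $e . '%', $e . '['],
        $s
    );
}

echo likeEscapeBracket('[draft] 50%', '='), "\n"; // =[draft] 50=%
```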

Solution 2:

Surprised no one bothered to mention it after all these years, but if you don't need to do complex wildcard matching (e.g. foo%baz), I think INSTR/LOCATE/POSITION, LEFT, RIGHT, etc. should suffice. In all of my cases, I only used LIKE to match anywhere in a string (that is, for example %foobar%), so after all the horror stories about escaping LIKE patterns, I'm now using INSTR instead.

Equivalent of value LIKE '%foobar%' (match anywhere):

INSTR(value, 'foobar') > 0

Equivalent of value LIKE 'foobar%' (match at start):

INSTR(value, 'foobar') = 1

Equivalent of value LIKE '%foobar' (match at end):

RIGHT(value, 6) = 'foobar'

It might not be as straightforward and easy to remember, and the solution for matching at the end could perhaps be improved somehow to be more universal. But these alternatives should hopefully at least give you some peace of mind in terms of security, as they bypass the need for any self-rolled escaping and don't require you to alter the actual parameter values (when using prepared statements anyway).
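
One way the three cases might be wrapped up for prepared statements is sketched below. The helper name substringCondition and its mode strings are invented for this example; for the match-at-end case it uses CHAR_LENGTH(?) in place of the hard-coded 6, so the needle is bound twice.

```php
<?php
// Hypothetical helper: returns [SQL condition, parameters to bind].
// $column must be a trusted identifier from your own code, never user input.
function substringCondition(string $column, string $needle, string $mode): array
{
    switch ($mode) {
        case 'anywhere':
            return ["INSTR($column, ?) > 0", [$needle]];
        case 'start':
            return ["INSTR($column, ?) = 1", [$needle]];
        case 'end':
            // CHAR_LENGTH(?) replaces the fixed length, so any needle works.
            return ["RIGHT($column, CHAR_LENGTH(?)) = ?", [$needle, $needle]];
        default:
            throw new InvalidArgumentException("Unknown mode: $mode");
    }
}

[$sql, $params] = substringCondition('value', '100%_off', 'end');
echo $sql, "\n";          // RIGHT(value, CHAR_LENGTH(?)) = ?
echo count($params), "\n"; // 2
```

The needle goes in untouched, % and _ included, and the returned parameters can be handed straight to the statement's execute() call.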