How to check that a string is a palindrome using regular expressions?
That was an interview question that I was unable to answer:
How to check that a string is a palindrome using regular expressions?
p.s. There is already a question "How to check if the given string is palindrome?" and it gives a lot of answers in different languages, but no answer that uses regular expressions.
Solution 1:
The answer to this question is that "it is impossible". More specifically, the interviewer is wondering if you paid attention in your computational theory class.
In your computational theory class you learned about finite state machines. A finite state machine is composed of nodes and edges. Each edge is annotated with a letter from a finite alphabet. One or more nodes are special "accepting" nodes and one node is the "start" node. As each letter is read from a given word we traverse the given edge in the machine. If we end up in an accepting state then we say that the machine "accepts" that word.
A regular expression can always be translated into an equivalent finite state machine. That is, one that accepts and rejects the same words as the regular expression (in the real world, some regexp languages allow for arbitrary functions, these don't count).
It is impossible to build a finite state machine that accepts all palindromes. The proof relies on the facts that we can easily build a string that requires an arbitrarily large number of nodes, namely the string
a^x b a^x (eg., aba, aabaa, aaabaaa, aaaabaaaa, ....)
where a^x is a repeated x times. This requires at least x nodes because, after seeing the 'b' we have to count back x times to make sure it is a palindrome.
Finally, getting back to the original question, you could tell the interviewer that you can write a regular expression that accepts all palindromes that are smaller than some finite fixed length. If there is ever a real-world application that requires identifying palindromes then it will almost certainly not include arbitrarily long ones, thus this answer would show that you can differentiate theoretical impossibilities from real-world applications. Still, the actual regexp would be quite long, much longer than equivalent 4-line program (easy exercise for the reader: write a program that identifies palindromes).
Solution 2:
While the PCRE engine does support recursive regular expressions (see the answer by Peter Krauss), you cannot use a regex on the ICU engine (as used, for example, by Apple) to achieve this without extra code. You'll need to do something like this:
This detects any palindrome, but does require a loop (which will be required because regular expressions can't count).
$a = "teststring";
while(length $a > 1)
{
$a =~ /(.)(.*)(.)/;
die "Not a palindrome: $a" unless $1 eq $3;
$a = $2;
}
print "Palindrome";
Solution 3:
It's not possible. Palindromes aren't defined by a regular language. (See, I DID learn something in computational theory)
Solution 4:
With Perl regex:
/^((.)(?1)\2|.?)$/
Though, as many have pointed out, this can't be considered a regular expression if you want to be strict. Regular expressions does not support recursion.