What's the Time Complexity of Average Regex algorithms?
This is one of the most popular outlines: Regular Expression Matching Can Be Simple And Fast . Running a DFA-compiled regular expression against a string is indeed O(n), but can require up to O(2^m) construction time/space (where m = regular expression size).
Are you familiar with the term Deterministic/Non-Deterministic Finite Automata?
Real regular expressions (when I say real I'm refering to those regex that recognize Regular Languages, and not the regex that almost every programming language include with backreferences, etc) can be converted into a DFA/NFA and both can be implemented in a mechanical way in a programming language (a NFA can be converted into a DFA)
What you have to do is:
- Find a way to convert a regex into an automaton
- Implement the recognition of the automaton in the programming language of your preference
That way, given a regex you can convert it to a DFA and run it to see if it matches or not a specified text.
This can be implemented in O(n)
, because DFA don't go backward (like a Turing Machine), so it matches the string or not. That is supposing you won't take in count overlapped matches, otherwise you will have to go back and start matching again...