When not to use Regex in C# (or Java, C++, etc.)

There are clearly lots of problems that look as if a simple regular expression will solve them, but that prove very hard to solve with regex.

So how does someone who is not a regex expert know whether it is worth learning regex to solve a given problem?

(See "Regex to parse C# source code to find all strings" for why I am asking this question.)

This seems to sum it up well:

Some people, when confronted with a problem, think “I know, I'll use regular expressions.” Now they have two problems...

(I have just changed the title of the question to make it more specific, as some of the problems with Regex in C# are solved in Perl and JScript, for example the fact that the two levels of quoting make a Regex so unreadable.)


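Don't try to use regex to parse hierarchical text such as program source code (or nested XML). Regular expressions in the formal sense are provably not powerful enough for that: for example, given a string of parentheses, they can't tell whether the parentheses are balanced. Some engines add extensions that go beyond this (balancing groups in .NET, recursion in PCRE), but the resulting patterns tend to be hard to read.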

Use parser generators (or similar technologies) for that.
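To illustrate the balanced-parentheses point: the check needs a counter that can grow without bound, which is exactly what a classic regular expression lacks. A minimal sketch in Java (the class and method names are made up for illustration), and not a substitute for a real parser:

```java
class BalancedParens {
    // Returns true if every '(' has a matching ')' and none closes too early.
    static boolean isBalanced(String text) {
        int depth = 0; // number of currently open parentheses
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == '(') {
                depth++;
            } else if (c == ')') {
                depth--;
                if (depth < 0) {
                    return false; // a ')' appeared before its '('
                }
            }
        }
        return depth == 0; // anything still open means unbalanced
    }

    public static void main(String[] args) {
        System.out.println(isBalanced("(()())")); // true
        System.out.println(isBalanced("(()"));    // false
    }
}
```

A dozen lines of ordinary code handle any nesting depth; real source code additionally needs to skip parentheses inside strings and comments, which is where a proper tokenizer or parser generator comes in.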

Also, I wouldn't recommend using regex to validate data that follows a strict formal standard, such as e-mail addresses. They're harder than you'd expect, and you'll end up with either an inaccurate regex or a very long one.
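As a rough illustration of the "inaccurate" case, here is a sketch in Java using the kind of short e-mail pattern that often gets copied around. The pattern and the class name are assumptions chosen for the example, not taken from any standard:

```java
import java.util.regex.Pattern;

class NaiveEmailCheck {
    // Hypothetical "simple" pattern: local part, '@', domain, 2-4 letter TLD.
    static final Pattern NAIVE =
        Pattern.compile("[A-Za-z0-9._-]+@[A-Za-z0-9.-]+\\.[A-Za-z]{2,4}");

    static boolean looksValid(String address) {
        // matches() requires the whole string to fit the pattern
        return NAIVE.matcher(address).matches();
    }

    public static void main(String[] args) {
        System.out.println(looksValid("alice@example.com"));      // true
        System.out.println(looksValid("user+tag@example.com"));   // false, yet '+' is allowed
        System.out.println(looksValid("curator@example.museum")); // false, TLD longer than 4
        System.out.println(looksValid("a@b..com"));               // true, yet the domain is malformed
    }
}
```

The pattern rejects addresses that are valid and accepts one that is not. Closing those gaps one by one is what leads to the very long regex.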


There are two aspects to consider:

  • Capability: is the language you are trying to recognize a Type-3 (regular) language? If so, you might use regex; if not, you need a more powerful tool.

  • Maintainability: if it takes more time to write, test and understand a regular expression than its programmatic counterpart, then it's not appropriate. Checking this is not straightforward. I'd recommend peer review with your colleagues (if they say "what the ..." when they see it, it's too complicated), or leave it undocumented for a few days, then look at it again yourself and measure how long it takes you to understand it.
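To make the maintainability comparison concrete, here is a sketch in Java that checks a dotted IPv4 address twice: once with a regex and once with plain code. Both the pattern and the class and method names are illustrative assumptions; the two methods are meant to agree on ordinary inputs, not to be a reference implementation.

```java
import java.util.regex.Pattern;

class Ipv4Check {
    // Compact, but each alternative has to be decoded by the reader.
    static final Pattern IPV4 = Pattern.compile(
        "((25[0-5]|2[0-4]\\d|1\\d\\d|[1-9]?\\d)\\.){3}"
        + "(25[0-5]|2[0-4]\\d|1\\d\\d|[1-9]?\\d)");

    static boolean matchesRegex(String s) {
        return IPV4.matcher(s).matches();
    }

    // Longer, but every rule is spelled out.
    static boolean matchesByHand(String s) {
        String[] parts = s.split("\\.", -1); // -1 keeps trailing empty parts
        if (parts.length != 4) return false;
        for (String part : parts) {
            if (part.isEmpty() || part.length() > 3) return false;
            for (int i = 0; i < part.length(); i++) {
                char c = part.charAt(i);
                if (c < '0' || c > '9') return false; // ASCII digits only
            }
            if (part.length() > 1 && part.charAt(0) == '0') return false; // no leading zeros
            if (Integer.parseInt(part) > 255) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(matchesRegex("192.168.0.1"));  // true
        System.out.println(matchesByHand("192.168.0.1")); // true
    }
}
```

Which version is more appropriate depends on who has to maintain it; showing both to a colleague is a quick way to run the peer-review test described above.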


I'm a beginner when it comes to regex, but IMHO it is worthwhile to spend some time learning basic regex. You'll realise that many, many problems you've solved differently could (and maybe should) be solved using regex.

For a particular problem, try to find a solution at a site like regexlib, and see if you can understand the solution.

As indicated above, regex might not be sufficient to solve a specific problem, but browsing a site like regexlib will certainly tell you whether regex is the right solution to your problem.