Regex.IsMatch vs string.Contains
For simple cases String.Contains will give you better performance, but it will not allow you to do complex pattern matching. Use String.Contains for non-pattern-matching scenarios (like the one in your example) and use regular expressions for scenarios in which you need more complex pattern matching.
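As a rough illustration of that split, here is a minimal sketch. The class name, method names, and the "error" strings are made up for this example and are not taken from the question:

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper names, for illustration only.
public static class MatchExamples
{
    // Plain substring test: no pattern involved, so String.Contains is enough.
    public static bool HasLiteral(string input)
    {
        return input.Contains("error");
    }

    // Needs a real pattern ("error" followed by exactly three digits),
    // which String.Contains cannot express.
    public static bool HasErrorCode(string input)
    {
        return Regex.IsMatch(input, @"error \d{3}\b");
    }
}
```

If the thing you are looking for can be written as a fixed string, the first form is usually all you need.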
A regular expression has a certain amount of overhead associated with it (expression parsing, compilation, execution, etc.) that a simple method like String.Contains simply does not have, which is why String.Contains will outperform a regular expression in examples like yours.
String.Contains can be slower when you compare it to a regular expression that is constructed once and reused. Considerably slower, even!
You can test it by running this benchmark:
    using System;
    using System.Diagnostics;
    using System.IO;
    using System.Text.RegularExpressions;

    class Program
    {
        public static int FoundString;
        public static int FoundRegex;

        static void DoLoop(bool show)
        {
            const string path = "C:\\file.txt";
            const int iterations = 1000000;
            var content = File.ReadAllText(path);
            const string searchString = "this exists in file";
            // Built once, outside the timed loop. Without RegexOptions.Compiled
            // this regex is interpreted rather than compiled to IL.
            var searchRegex = new Regex("this exists in file");

            // Time String.Contains.
            var containsTimer = Stopwatch.StartNew();
            for (var i = 0; i < iterations; i++)
            {
                if (content.Contains(searchString))
                {
                    FoundString++;
                }
            }
            containsTimer.Stop();

            // Time Regex.IsMatch on the same content.
            var regexTimer = Stopwatch.StartNew();
            for (var i = 0; i < iterations; i++)
            {
                if (searchRegex.IsMatch(content))
                {
                    FoundRegex++;
                }
            }
            regexTimer.Stop();

            if (!show) return;
            // The counters are static, so they include the warm-up pass.
            Console.WriteLine("FoundString: {0}", FoundString);
            Console.WriteLine("FoundRegex: {0}", FoundRegex);
            Console.WriteLine("containsTimer: {0}", containsTimer.ElapsedMilliseconds);
            Console.WriteLine("regexTimer: {0}", regexTimer.ElapsedMilliseconds);
            Console.ReadLine();
        }

        static void Main(string[] args)
        {
            DoLoop(false); // warm-up pass (JIT, file cache); results are not printed
            DoLoop(true);  // measured pass
        }
    }
To determine which is the fastest you will have to benchmark on your own system. However, regular expressions are complex, and chances are that String.Contains() will be the fastest and, in your case, also the simplest solution.
The implementation of String.Contains() will eventually call the native method IndexOfString(), and the implementation of that is only known by Microsoft. However, a good algorithm for implementing this method is what is known as the Knuth–Morris–Pratt algorithm. The complexity of this algorithm is O(m + n), where m is the length of the string you are searching for and n is the length of the string you are searching in, making it a very efficient algorithm.
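To make the O(m + n) claim concrete, here is a minimal sketch of Knuth–Morris–Pratt in C#. This is not how String.Contains() is actually implemented; the class and method names are invented for illustration, and the comparison is a plain ordinal char-by-char one:

```csharp
using System;

// Hypothetical illustration of Knuth-Morris-Pratt; not the framework's code.
public static class KmpSearch
{
    // fail[i] is the length of the longest proper prefix of pattern[0..i]
    // that is also a suffix of it. Building the table takes O(m).
    private static int[] BuildFailureTable(string pattern)
    {
        var fail = new int[pattern.Length];
        var k = 0;
        for (var i = 1; i < pattern.Length; i++)
        {
            while (k > 0 && pattern[i] != pattern[k]) k = fail[k - 1];
            if (pattern[i] == pattern[k]) k++;
            fail[i] = k;
        }
        return fail;
    }

    // Returns the index of the first occurrence of pattern in text, or -1.
    // The scan never moves backwards in text, so it takes O(n).
    public static int IndexOf(string text, string pattern)
    {
        if (pattern.Length == 0) return 0;
        var fail = BuildFailureTable(pattern);
        var k = 0; // number of pattern characters currently matched
        for (var i = 0; i < text.Length; i++)
        {
            while (k > 0 && text[i] != pattern[k]) k = fail[k - 1];
            if (text[i] == pattern[k]) k++;
            if (k == pattern.Length) return i - k + 1;
        }
        return -1;
    }
}
```

A Contains-style check would then be KmpSearch.IndexOf(content, searchString) >= 0. The point is that on a mismatch the failure table lets the search reuse what it has already matched instead of restarting from the next character of the text.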
Actually, the complexity of a search using a regular expression can be as low as O(n), depending on the implementation, so it may still be competitive in some situations. Only a benchmark will be able to determine this.
If you are really concerned about search speed, Christian Charras and Thierry Lecroq have a lot of material about exact string matching algorithms at Université de Rouen.