Case insensitive 'Contains(string)'
Is there a way to make the following return true?
string title = "ASTRINGTOTEST";
title.Contains("string");
There doesn't seem to be an overload that allows me to set the case sensitivity. Currently I UPPERCASE them both, but that's just silly (by which I am referring to the i18n issues that come with up- and down-casing).
UPDATE
This question is ancient, and since then I have realized that I asked for a simple answer to a really vast and difficult topic, if you care to investigate it fully.
For most cases, in mono-lingual, English code bases this answer will suffice. I suspect that because most people coming here fall into this category, this is the most popular answer.
This answer, however, brings up the inherent problem that we can't compare text case-insensitively until we know both texts are in the same culture and we know what that culture is. This is maybe a less popular answer, but I think it is more correct and that's why I marked it as such.
Solution 1:
You could use the String.IndexOf method and pass StringComparison.OrdinalIgnoreCase as the type of search to use:
string title = "STRING";
bool contains = title.IndexOf("string", StringComparison.OrdinalIgnoreCase) >= 0;
Even better is defining a new extension method for string:
public static class StringExtensions
{
    public static bool Contains(this string source, string toCheck, StringComparison comp)
    {
        return source?.IndexOf(toCheck, comp) >= 0;
    }
}
Note that null propagation (?.) is available since C# 6.0 (VS 2015); for older versions use:
if (source == null) return false;
return source.IndexOf(toCheck, comp) >= 0;
USAGE:
string title = "STRING";
bool contains = title.Contains("string", StringComparison.OrdinalIgnoreCase);
Solution 2:
To test if the string paragraph contains the string word (thanks @QuarterMeister):
culture.CompareInfo.IndexOf(paragraph, word, CompareOptions.IgnoreCase) >= 0
Where culture is the instance of CultureInfo describing the language that the text is written in.
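As a minimal sketch of the expression above in context, assuming the text is known to be US English: the helper name ContainsWord, the en-US culture and the sample strings are illustrative assumptions, not part of the original answer.

```csharp
using System;
using System.Globalization;

public static class CultureContainsDemo
{
    // Hypothetical helper: true if 'paragraph' contains 'word',
    // ignoring case according to the rules of 'culture'.
    public static bool ContainsWord(string paragraph, string word, CultureInfo culture)
    {
        return culture.CompareInfo.IndexOf(paragraph, word, CompareOptions.IgnoreCase) >= 0;
    }

    public static void Main()
    {
        // Assumption: the text being searched is US English.
        CultureInfo culture = CultureInfo.GetCultureInfo("en-US");
        Console.WriteLine(ContainsWord("ASTRINGTOTEST", "string", culture));
    }
}
```

The culture is passed in explicitly rather than read from the current thread, so the caller has to state which language's casing rules apply.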
This solution is transparent about the definition of case-insensitivity, which is language dependent. For example, the English language uses the characters I and i for the upper and lower case versions of the ninth letter, whereas the Turkish language uses these characters for the eleventh and twelfth letters of its 29-letter alphabet. The Turkish upper case version of 'i' is the unfamiliar character 'İ'.
Thus the strings tin and TIN are the same word in English, but different words in Turkish. As I understand, one means 'spirit' and the other is an onomatopoeic word. (Turks, please correct me if I'm wrong, or suggest a better example.)
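The difference can be sketched by running the same search under two cultures. This is an illustration under assumptions: the class and method names are made up for the example, and the Turkish result depends on the runtime having Turkish culture data available (for instance, it would not differ from the English result if the application runs in invariant globalization mode).

```csharp
using System;
using System.Globalization;

public static class TurkishIDemo
{
    // Case-insensitive containment under the rules of the given culture.
    public static bool ContainsIgnoreCase(string text, string value, CultureInfo culture)
    {
        return culture.CompareInfo.IndexOf(text, value, CompareOptions.IgnoreCase) >= 0;
    }

    public static void Main()
    {
        CultureInfo english = CultureInfo.GetCultureInfo("en-US");
        CultureInfo turkish = CultureInfo.GetCultureInfo("tr-TR");

        // English pairs 'i' with 'I', so this is expected to match.
        Console.WriteLine(ContainsIgnoreCase("TIN", "tin", english));

        // Turkish pairs 'i' with 'İ' and 'ı' with 'I', so with Turkish
        // culture data available this is expected not to match.
        Console.WriteLine(ContainsIgnoreCase("TIN", "tin", turkish));
    }
}
```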
To summarise, you can only answer the question 'are these two strings the same but in different cases' if you know what language the text is in. If you don't know, you'll have to take a punt. Given English's hegemony in software, you should probably resort to CultureInfo.InvariantCulture, because it will be wrong in familiar ways.
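One way to encode that fallback is a small extension method that takes the culture when it is known and uses CultureInfo.InvariantCulture otherwise. This is only a sketch: the name ContainsIgnoreCase, the optional parameter and the choice to return false for null inputs are assumptions made for the example.

```csharp
using System;
using System.Globalization;

public static class InvariantContains
{
    // Hypothetical helper: uses the supplied culture when the language is
    // known, and falls back to InvariantCulture when it is not (null).
    public static bool ContainsIgnoreCase(this string source, string value, CultureInfo culture = null)
    {
        // Assumption: a null source or value is treated as "not contained".
        if (source == null || value == null) return false;

        CompareInfo compareInfo = (culture ?? CultureInfo.InvariantCulture).CompareInfo;
        return compareInfo.IndexOf(source, value, CompareOptions.IgnoreCase) >= 0;
    }
}
```

With this in scope, title.ContainsIgnoreCase("string") takes the punt on the invariant culture, while title.ContainsIgnoreCase("string", culture) uses the language you actually know the text to be in.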