.NET method to convert a string to sentence case
I'm looking for a function to convert a string of text that is in UpperCase to SentenceCase. All the examples I can find turn the text into TitleCase.
Sentence case in a general sense describes the way that capitalization is used within a sentence. Sentence case also describes the standard capitalization of an English sentence, i.e. the first letter of the sentence is capitalized, with the rest being lower case (unless requiring capitalization for a specific reason, e.g. proper nouns, acronyms, etc.).
Can anyone point me in the direction of a script or function for SentenceCase?
There isn't anything built in to .NET. However, this is one of those cases where regular expression processing may actually work well. I would start by converting the entire string to lower case, and then, as a first approximation, use a regex to find all sequences like [a-z]\.\s+(.) and use ToUpper() to convert the captured group to upper case. The Regex class has an overloaded Replace() method which accepts a MatchEvaluator delegate, which allows you to define how to replace the matched value.
Here's a code example of this at work:
var sourcestring = "THIS IS A GROUP. OF CAPITALIZED. LETTERS.";
// start by converting entire string to lower case
var lowerCase = sourcestring.ToLower();
// matches the first letter of the string, as well as the first character of each subsequent sentence
var r = new Regex(@"(^[a-z])|\.\s+(.)", RegexOptions.ExplicitCapture);
// MatchEvaluator delegate converts each matched sentence start to uppercase
var result = r.Replace(lowerCase, s => s.Value.ToUpper());
// result is: "This is a group. Of capitalized. Letters."
This could be refined in a number of different ways to better match a broader variety of sentence patterns (not just those ending in a letter+period).
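As one possible refinement, here is a minimal sketch that also treats '!' and '?' as sentence terminators and skips leading whitespace. The names SentenceCaseDemo and ToSentenceCase are hypothetical, and the use of invariant-culture casing is an assumption made so the result does not depend on the machine's locale.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SentenceCaseDemo
{
    // Matches the first lowercase letter of the string (after optional leading
    // whitespace), or the first lowercase letter following '.', '!' or '?'
    // plus at least one whitespace character.
    private static readonly Regex SentenceStart =
        new Regex(@"(?:^\s*|[.!?]\s+)\p{Ll}");

    // Hypothetical helper: lower-cases the input, then upper-cases each sentence start.
    public static string ToSentenceCase(string input)
    {
        if (string.IsNullOrEmpty(input))
            return input;

        // Assumption: invariant-culture casing is acceptable for the input.
        string lower = input.ToLowerInvariant();

        // Upper-casing the whole match is safe: punctuation and whitespace are unchanged.
        return SentenceStart.Replace(lower, m => m.Value.ToUpperInvariant());
    }
}
```

Calling SentenceCaseDemo.ToSentenceCase("IS THIS A TEST? YES! IT IS. DONE.") returns "Is this a test? Yes! It is. Done." This still lower-cases proper nouns and acronyms, and it will capitalize the word after an abbreviation such as "e.g.", so it remains an approximation.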
This works for me.
/// <summary>
/// Converts a string to sentence case.
/// </summary>
/// <param name="input">The string to convert.</param>
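/// <returns>The input in lower case, with its first character in upper case.</returns>
public static string SentenceCase(string input)
{
    // Nothing to convert for null or empty input
    if (string.IsNullOrEmpty(input))
        return input;
    string sentence = input.ToLower();
    return sentence[0].ToString().ToUpper() +
        sentence.Substring(1);
}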
There is a built-in ToTitleCase() function that will be extended to support multiple cultures in the future.
Example from MSDN:
using System;
using System.Globalization;
public class Example
{
    public static void Main()
    {
        string[] values = { "a tale of two cities", "gROWL to the rescue",
                            "inside the US government", "sports and MLB baseball",
                            "The Return of Sherlock Holmes", "UNICEF and children"};
        TextInfo ti = CultureInfo.CurrentCulture.TextInfo;
        foreach (var value in values)
            Console.WriteLine("{0} --> {1}", value, ti.ToTitleCase(value));
    }
}
// The example displays the following output:
// a tale of two cities --> A Tale Of Two Cities
// gROWL to the rescue --> Growl To The Rescue
// inside the US government --> Inside The US Government
// sports and MLB baseball --> Sports And MLB Baseball
// The Return of Sherlock Holmes --> The Return Of Sherlock Holmes
// UNICEF and children --> UNICEF And Children
While it is generally useful, it has some important limitations:
Generally, title casing converts the first character of a word to uppercase and the rest of the characters to lowercase. However, this method does not currently provide proper casing to convert a word that is entirely uppercase, such as an acronym. The following table shows the way the method renders several strings.
...the ToTitleCase method provides an arbitrary casing behavior which is not necessarily linguistically correct. A linguistically correct solution would require additional rules, and the current algorithm is somewhat simpler and faster. We reserve the right to make this API slower in the future.
Source: http://msdn.microsoft.com/en-us/library/system.globalization.textinfo.totitlecase.aspx
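Because ToTitleCase leaves words that are entirely uppercase untouched, a common workaround for all-caps input is to lower-case the string first. The sketch below illustrates that; the names TitleCaseDemo and ToTitleCaseFromUpper are hypothetical, and the en-US culture is an assumption. Note that this produces title case, not sentence case, and that real acronyms are flattened as well.

```csharp
using System;
using System.Globalization;

public static class TitleCaseDemo
{
    // Hypothetical helper: lower-cases first so that all-uppercase words are
    // not treated as acronyms and skipped by ToTitleCase.
    public static string ToTitleCaseFromUpper(string input)
    {
        if (string.IsNullOrEmpty(input))
            return input;

        // Assumption: en-US casing rules are acceptable for the input.
        TextInfo ti = new CultureInfo("en-US", false).TextInfo;
        return ti.ToTitleCase(ti.ToLower(input));
    }
}
```

With this helper, "UNICEF AND CHILDREN" becomes "Unicef And Children", whereas passing the same string straight to ToTitleCase would return it unchanged.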