regex to remove ordinals
You need to use a look-behind assertion so that only st|nd|rd|th preceded by a [0-9] is matched, but the [0-9] itself isn't included in the match, i.e.:
(?<=[0-9])(?:st|nd|rd|th)
I've linked to the Perl-compatible syntax, but if you're using POSIX, POSIX extended, vi or one of the many other regex syntaxes, you'll need to look up the equivalent syntax.
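As a minimal sketch of the look-behind approach, here it is in Python, whose re module accepts this fixed-width look-behind. The function name strip_ordinal_suffixes and the sample sentence are made up for illustration.

```python
import re

# Look-behind: match the suffix only when a digit comes immediately before it.
# The digit is checked but not consumed, so it survives the substitution.
ORDINAL_SUFFIX = re.compile(r"(?<=[0-9])(?:st|nd|rd|th)")


def strip_ordinal_suffixes(text: str) -> str:
    """Remove st/nd/rd/th when directly preceded by a digit; keep the digit."""
    return ORDINAL_SUFFIX.sub("", text)


print(strip_ordinal_suffixes("the 1st, 2nd, 3rd and 4th of the month"))
# the 1, 2, 3 and 4 of the month
```

Words such as "the" and "month" are left alone because their "th" is not preceded by a digit.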
In Perl (with /g so that every occurrence is replaced, not just the first):
$var =~ s{\b(\d+)(?:st|nd|rd|th)\b}{$1}g;
In PHP:
$var = preg_replace('/\\b(\d+)(?:st|nd|rd|th)\\b/', '$1', $var);
In .NET:
text = Regex.Replace(text, @"\b(\d+)(?:st|nd|rd|th)\b", "$1");
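For comparison, the same capture-group pattern can be sketched in Python. Instead of a look-behind it captures the digits and writes them back, and the \b anchors mean a suffix is only stripped when the number plus suffix stands as a whole word. The name strip_ordinals and the sample input are illustrative assumptions.

```python
import re

# Capture the digits, match the suffix, and require word boundaries on both sides.
ORDINAL = re.compile(r"\b(\d+)(?:st|nd|rd|th)\b")


def strip_ordinals(text: str) -> str:
    """Replace each whole-word ordinal such as '22nd' with just its digits."""
    return ORDINAL.sub(r"\1", text)


print(strip_ordinals("1st place, 22nd floor, 3rdparty"))
# 1 place, 22 floor, 3rdparty
```

"3rdparty" is untouched because there is no word boundary after "rd", which is a difference from the plain look-behind version.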
If you also want to remove the numbers themselves along with their ordinal suffixes, you could use this one:
[0-9]+(?:st| st|nd| nd|rd| rd|th| th)
So for the text "The 3rd person is missing but the 2 nd and the 1st is here" you'll get "The  person is missing but the  and the  is here" (each removed number leaves its surrounding spaces behind, so the result contains double spaces).
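A rough Python sketch of this variant follows. The helper collapse_spaces is an addition of mine, not part of the answer; it tidies up the double spaces left where a number was removed.

```python
import re

# Digits followed by an ordinal suffix, with an optional single space in between.
NUMBER_WITH_ORDINAL = re.compile(r"[0-9]+(?:st| st|nd| nd|rd| rd|th| th)")


def remove_ordinal_numbers(text: str) -> str:
    """Delete the number together with its ordinal suffix."""
    return NUMBER_WITH_ORDINAL.sub("", text)


def collapse_spaces(text: str) -> str:
    """Hypothetical cleanup step: squeeze runs of spaces into one."""
    return re.sub(r" {2,}", " ", text)


sample = "The 3rd person is missing but the 2 nd and the 1st is here"
removed = remove_ordinal_numbers(sample)
print(collapse_spaces(removed))
# The person is missing but the and the is here
```

Note that the space-tolerant alternatives can also hit ordinary words: "2 things" would lose "2 th" and become "ings", so this pattern is best kept for text where that cannot happen.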
Try a positive lookbehind:
(?<=[0-9])(?:st|nd|rd|th)
assuming the dialect of regex supports it.