Regular Expression For Duplicate Words

I'm a regular expression newbie, and I can't quite figure out how to write a single regular expression that would "match" any duplicate consecutive words such as:

Paris in the the spring.

Not that that is related.

Why are you laughing? Are my my regular expressions THAT bad??

Is there a single regular expression that will match ALL of the duplicated words above (the the, that that, my my)?

Try this regular expression:


Here \b is a word boundary and \1 references the captured match of the first group.
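As a minimal sketch, a pattern that fits this description (a word boundary, one captured word, whitespace, then a backreference to that word) could be `\b(\w+)\s+\1\b`. That exact form is an assumption, as are the class and method names below, which are only illustrative. In Java it can be tried on the sentences from the question like this:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DuplicateWordFinder {
    // Assumed pattern: a word, some whitespace, then the same word again.
    // \b keeps the match from starting or ending in the middle of a word.
    private static final Pattern DUPLICATE =
            Pattern.compile("\\b(\\w+)\\s+\\1\\b");

    // Returns every doubled word pair found in the text, in order.
    static List<String> findDuplicates(String text) {
        List<String> found = new ArrayList<>();
        Matcher m = DUPLICATE.matcher(text);
        while (m.find()) {
            found.add(m.group()); // the whole match, e.g. "the the"
        }
        return found;
    }

    public static void main(String[] args) {
        System.out.println(findDuplicates("Paris in the the spring."));
        System.out.println(findDuplicates("Not that that is related."));
        System.out.println(findDuplicates("Are my my regular expressions THAT bad??"));
    }
}
```

This version is case sensitive and only catches a word repeated once; passing Pattern.CASE_INSENSITIVE to Pattern.compile would also catch pairs such as "The the".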

I believe this regex handles more situations:


A good selection of test strings can be found here:

The expression below should find any number of consecutive repeated words. The matching can be made case insensitive.

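import java.util.regex.Matcher;
import java.util.regex.Pattern;

// input is the String to clean up
String regex = "\\b(\\w+)(\\s+\\1\\b)*";
Pattern p = Pattern.compile(regex, Pattern.CASE_INSENSITIVE);

Matcher m = p.matcher(input);

// Check for subsequences of input that match the compiled pattern
while (m.find()) {
    // Replace the whole run of repeated words with its first word
    input = input.replaceAll(m.group(), m.group(1));
}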

Sample Input : Goodbye goodbye GooDbYe

Sample Output : Goodbye


The regular expression explained:

\b : A word boundary, here the start of a word

\w+ : One or more word characters

(\s+\1\b)* : One or more whitespace characters followed by a word that matches the previous word and ends at a word boundary. Wrapping the whole group in * lets it match any number of repetitions.

Grouping:

m.group(0) : Contains the whole matched run; in the example above, Goodbye goodbye GooDbYe

m.group(1) : Contains the first word of the matched run; in the example above, Goodbye

The replaceAll call replaces each run of consecutive repeated words with the first instance of the word.
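The same replacement can also be sketched in a single pass with Matcher.replaceAll and the "$1" group reference, instead of looping over find(). This is a variation on the approach above, not the original answer's code: the class and method names are hypothetical, and the quantifier is changed from * to + so that a word with no repetition is not matched at all.

```java
import java.util.regex.Pattern;

class DuplicateWordCollapser {
    // A word followed by one or more case-insensitive repeats of itself.
    private static final Pattern REPEATED =
            Pattern.compile("\\b(\\w+)(\\s+\\1\\b)+", Pattern.CASE_INSENSITIVE);

    // Collapses each run of repeated words to the first word of the run.
    static String collapse(String input) {
        // "$1" in the replacement refers to the first captured word.
        return REPEATED.matcher(input).replaceAll("$1");
    }

    public static void main(String[] args) {
        System.out.println(collapse("Goodbye goodbye GooDbYe"));   // prints "Goodbye"
        System.out.println(collapse("Paris in the the spring."));  // prints "Paris in the spring."
    }
}
```

Because the replacement is built from the capture group rather than from the matched text reused as a new regex, the input is scanned once and the matched text is never reinterpreted as a pattern.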