Clean up a comma-separated list by regex

Solution 1:

You can use:

\h*([,\h])[,\h]*

See an online demo. Or alternatively:

\h*([,\h])(?1)*

See an online demo. Details:

  • \h* - 0+ (Greedy) horizontal-whitespace chars;
  • ([,\h]) - A 1st capture group to match a comma or horizontal-whitespace;
  • [,\h]* - Option 1: 0+ (Greedy) commas or horizontal-whitespace chars;
  • (?1)* - Option 2: Recurse the 1st subpattern 0+ (Greedy) times.
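
To see what the recursion in Option 2 does, it may help to compare (?1) with a backreference. In PCRE, (?1) re-runs the group's pattern, while \1 only re-matches the text the group captured. A minimal sketch, where the input string and variable names are made up for illustration:

```php
<?php
$input = ', ,b';

// (?1)* re-applies the pattern [,\h], so commas and spaces can be mixed.
preg_match('~\h*([,\h])(?1)*~', $input, $withSubroutine);

// \1* only repeats the exact captured text, here a single comma.
preg_match('~\h*([,\h])\1*~', $input, $withBackreference);

var_dump($withSubroutine[0]);    // string(3) ", ,"
var_dump($withBackreference[0]); // string(1) ","
```

With the backreference the match stops at the first space, so the separator run would not be consumed in one go.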

Replace with the 1st capture group:

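$str = 'first , second ,, third, ,fourth   suffix';
// Option 1: character class repeated after the capture group
echo preg_replace('~\h*([,\h])[,\h]*~', '$1', $str), PHP_EOL;
// Option 2: the group's pattern reused via (?1)
echo preg_replace('~\h*([,\h])(?1)*~', '$1', $str), PHP_EOL;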

Both print:

first,second,third,fourth suffix
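
To check which character ends up in the capture group for each separator run, one option is to list the matches with preg_match_all. This is only a sketch for inspection, not part of the solution. Because \h* is greedy, it swallows leading whitespace so the group lands on a comma when one is present; for a whitespace-only run the engine backtracks one character and the group captures a space.

```php
<?php
$str = 'first , second ,, third, ,fourth   suffix';

// Collect every match together with its first capture group.
preg_match_all('~\h*([,\h])[,\h]*~', $str, $matches, PREG_SET_ORDER);

foreach ($matches as $m) {
    // $m[0] is the whole separator run, $m[1] is what replaces it.
    printf("[%s] => [%s]\n", $m[0], $m[1]);
}
// [ , ] => [,]
// [ ,, ] => [,]
// [, ,] => [,]
// [   ] => [ ]
```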

Solution 2:

You can use:

preg_replace('~\s*(?:(,)\s*)+|(\s)+~', '$1$2', $str)

Merging the two alternatives into one results in:

preg_replace('~\s*(?:([,\s])\s*)+~', '$1', $str)

See the regex demo and the PHP demo. Details:

  • \s*(?:(,)\s*)+ - zero or more whitespaces, then one or more repetitions of a comma (captured into Group 1, $1) followed by zero or more whitespaces
  • | - or
  • (\s)+ - one or more whitespaces while capturing the last one into Group 2 ($2).

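In the second regex, ([,\s]) captures a single comma or a whitespace character. Note that \s also matches line breaks, whereas the \h used in Solution 1 matches horizontal whitespace only.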
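
As a quick check, the sketch below runs both Solution 2 patterns on the sample string from Solution 1, then feeds a hypothetical multi-line input (not from the original question) through one pattern from each solution to show where they differ:

```php
<?php
$str = 'first , second ,, third, ,fourth   suffix';

// Both Solution 2 patterns give the same result on the sample input.
$twoGroups = preg_replace('~\s*(?:(,)\s*)+|(\s)+~', '$1$2', $str);
$oneGroup  = preg_replace('~\s*(?:([,\s])\s*)+~', '$1', $str);
var_dump($twoGroups === $oneGroup); // bool(true)

// Hypothetical multi-line input: \s also consumes the line break, \h does not.
$multiline = "a ,\n b";
$withS = preg_replace('~\s*(?:([,\s])\s*)+~', '$1', $multiline);
$withH = preg_replace('~\h*([,\h])[,\h]*~', '$1', $multiline);
var_dump($withS); // string(3) "a,b"
var_dump($withH); // string(5): "a,", the line break, then " b"
```

If items may span several lines and the line breaks should survive, the \h variant is probably the safer choice.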