How to split a string with parentheses and spaces into a list

Inside a character class, repeating \w adds nothing, so [\w \w]* is the same as [\w ]*. Because of the * quantifier, it can also match an empty string.
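As a quick check of that point, here is a minimal sketch. The sample text and the variable names are only illustrative and do not come from the question.

```python
import re

text = "so what"  # illustrative sample, not from the question

# Repeated members of a character class are redundant, so both
# classes describe the same set: word characters plus a space.
a = re.compile(r"[\w \w]*")
b = re.compile(r"[\w ]*")

same_match = a.fullmatch(text).group() == b.fullmatch(text).group()

# The * quantifier allows zero repetitions, so "" matches too.
matches_empty = b.fullmatch("") is not None

print(same_match)     # True
print(matches_empty)  # True
```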

Instead of using split, you can use re.findall without any capture groups and write the pattern like:

\(\w+(?:[^\S\n]+\w+)*\)|\w+
  • \( Match (
  • \w+ Match 1+ word chars
  • (?:[^\S\n]+\w+)* Optionally repeat matching 1+ whitespace chars other than a newline, followed by 1+ word chars
  • \) Match )
  • | Or
  • \w+ Match 1+ word chars
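The same breakdown can be kept next to the pattern itself by compiling it with re.VERBOSE. This is only a sketch of an alternative way to write it; the names rx_verbose, sample and tokens are made up for the example.

```python
import re

# Same pattern as above, spread over several lines with comments.
# re.VERBOSE ignores whitespace outside character classes.
rx_verbose = re.compile(r"""
    \(                  # literal opening parenthesis
    \w+                 # first word inside the parentheses
    (?:[^\S\n]+\w+)*    # zero or more: whitespace (no newline), then a word
    \)                  # literal closing parenthesis
    |                   # or
    \w+                 # a bare word outside parentheses
""", re.VERBOSE)

sample = "(do you mean) what"
tokens = rx_verbose.findall(sample)
print(tokens)  # ['(do you mean)', 'what']
```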


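import re

string = "(so) what (are you trying to say)? what (do you mean)"

# Either a parenthesized group of space-separated words, or a single word
rx = re.compile(r"\(\w+(?:[^\S\n]+\w+)*\)|\w+")

# No capture groups in the pattern, so findall returns the full matches
print(rx.findall(string))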

Output

['(so)', 'what', '(are you trying to say)', 'what', '(do you mean)']
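The [^\S\n] class in the pattern matches whitespace without crossing a newline. To see what that choice changes, the sketch below compares it with a variant that uses \s instead. The \s variant and the multiline sample are assumptions added for illustration, not part of the original pattern.

```python
import re

# Pattern from above: whitespace inside the parentheses may not be a newline
no_newline = re.compile(r"\(\w+(?:[^\S\n]+\w+)*\)|\w+")

# Hypothetical variant for comparison: any whitespace, newlines included
any_space = re.compile(r"\(\w+(?:\s+\w+)*\)|\w+")

multiline = "(so\nwhat) now"  # illustrative input with a newline inside the parentheses

strict = no_newline.findall(multiline)
loose = any_space.findall(multiline)

print(strict)  # ['so', 'what', 'now']
print(loose)   # ['(so\nwhat)', 'now']
```

If parenthesized groups are allowed to span lines in your data, the \s variant may be the better fit.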