How to split a string with parentheses and spaces into a list
Using [\w \w]* is the same as [\w ]*, because repeating \w inside a character class has no effect. Since * allows zero repetitions, it also matches an empty string.
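Both points can be checked quickly. This is a minimal sketch; the sample string is made up for illustration and is not from the question:

```python
import re

sample = "so what (do you mean)"  # hypothetical sample text

# Duplicating \w inside a character class changes nothing:
# both classes match a word character or a space.
same = re.findall(r"[\w \w]*", sample) == re.findall(r"[\w ]*", sample)
print(same)  # True

# Because of *, the pattern also matches the empty string.
matches_empty = re.fullmatch(r"[\w ]*", "") is not None
print(matches_empty)  # True
```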
Instead of using split, you can use re.findall without any capture groups and write the pattern like:
\(\w+(?:[^\S\n]+\w+)*\)|\w+
- \( Match (
- \w+ Match 1+ word chars
- (?:[^\S\n]+\w+)* Optionally repeat matching 1+ whitespace chars other than newlines, followed by 1+ word chars
- \) Match )
- | Or
- \w+ Match 1+ word chars
import re

string = "(so) what (are you trying to say)? what (do you mean)"
# Match a parenthesized run of words, or else a single word
rx = re.compile(r"\(\w+(?:[^\S\n]+\w+)*\)|\w+")
print(rx.findall(string))
Output
['(so)', 'what', '(are you trying to say)', 'what', '(do you mean)']
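To see how the pattern behaves on other inputs, here is a small sketch that wraps it in a helper. The function name split_keep_parens and the test strings are hypothetical, chosen only for illustration. Note that parentheses are kept only when they enclose words separated by whitespace on one line; otherwise the second alternative picks up the individual words.

```python
import re

# Same pattern as above; the helper name is hypothetical.
TOKEN_RX = re.compile(r"\(\w+(?:[^\S\n]+\w+)*\)|\w+")

def split_keep_parens(text):
    """Return parenthesized word groups and standalone words, in order."""
    return TOKEN_RX.findall(text)

print(split_keep_parens("(do you mean) this"))  # ['(do you mean)', 'this']
# A comma inside the parentheses makes the first alternative fail,
# so only the individual words are returned.
print(split_keep_parens("(a, b) c"))            # ['a', 'b', 'c']
# [^\S\n] excludes newlines, so a group cannot span lines.
print(split_keep_parens("(a\nb)"))              # ['a', 'b']
```

If groups containing punctuation should also be kept whole, the pattern would need to be widened; that is outside what the pattern above does.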