re.search() only matches the first occurrence

I'm trying to match the pattern:

<--Header Title-->
some body text

The following only matches the first occurrence:

string1 = """<-- Option 1 -->
Nice text
<--Final stuff-->
Listing all
of
the
text
"""

regex = re.compile(r"<--([\w\s]+)-->([\s\S]*?)(?=\n<--|$)") 
m = regex.search(string1)
print m.groups()

Which results in:

(' Option 1 ', '\nNice text')

However, it seems to work fine using pythex.

What am I doing wrong?


Solution 1:

Re.search only matches the first occurrence within the string. You want finditer or findall.

re.search

Scan through string looking for the first location where the regular expression pattern produces a match, and return a corresponding MatchObject instance. Return None if no position in the string matches the pattern; note that this is different from finding a zero-length match at some point in the string.

Finditer returns match objects for all locations within the target string, yielding an iterator, while findall returns the substrings for all matches.

>>> import re
>>> re.findall('a', 'ababababa')
['a', 'a', 'a', 'a', 'a']

>>> x = list(re.finditer('a', 'ababababa'))
>>> x
[<_sre.SRE_Match object; span=(0, 1), match='a'>,
 <_sre.SRE_Match object; span=(2, 3), match='a'>,
 <_sre.SRE_Match object; span=(4, 5), match='a'>,
 <_sre.SRE_Match object; span=(6, 7), match='a'>,
 <_sre.SRE_Match object; span=(8, 9), match='a'>]
>>> x[0].group()
'a'