Capturing group with findall?

How can I access the captured groups when I call findall(r'regex(with)capturing.goes.here')? I know I can do it with finditer, but I don't want to iterate.


Solution 1:

When the pattern contains groups, findall returns the captured groups rather than the whole match:

>>> import re
>>> re.findall('abc(de)fg(123)', 'abcdefg123 and again abcdefg123')
[('de', '123'), ('de', '123')]

Relevant doc excerpt:

Return all non-overlapping matches of pattern in string, as a list of strings. The string is scanned left-to-right, and matches are returned in the order found. If one or more groups are present in the pattern, return a list of groups; this will be a list of tuples if the pattern has more than one group. Empty matches are included in the result unless they touch the beginning of another match.
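
The three cases in that excerpt (no group, one group, several groups) can be checked side by side. This is a minimal sketch; the sample text is reused from the example above and the variable names are only illustrative:

```python
import re

text = 'abcdefg123 and again abcdefg123'

# No groups: each item is the whole match.
no_groups = re.findall(r'abcdefg\d+', text)

# One group: each item is that group's text (a plain string, not a tuple).
one_group = re.findall(r'abc(de)fg\d+', text)

# Two or more groups: each item is a tuple with one entry per group.
two_groups = re.findall(r'abc(de)fg(\d+)', text)

print(no_groups)   # ['abcdefg123', 'abcdefg123']
print(one_group)   # ['de', 'de']
print(two_groups)  # [('de', '123'), ('de', '123')]
```

Note that with exactly one group the whole match is not returned at all, which is usually what surprises people.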

Solution 2:

Use groups freely. The matches are returned as a list of tuples, with one entry per group, ordered by each group's opening parenthesis:

>>> import re
>>> re.findall('(1(23))45', '12345')
[('123', '23')]

If you want the full match to be included, just enclose the entire regex in a group:

>>> re.findall('(1(23)45)', '12345')
[('12345', '23')]
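
If the parentheses are only needed for grouping and not for capturing, a related option is the non-capturing form (?:...). A small sketch, assuming you want findall to hand back either the plain full matches or just one chosen group (the variable names are illustrative):

```python
import re

# (?:...) groups without capturing, so findall sees no groups at all
# and returns the full matches as plain strings.
full_only = re.findall(r'(?:1(?:23))45', '12345')

# Mixing both forms: only the capturing group (23) is reported.
inner_only = re.findall(r'(?:1(23))45', '12345')

print(full_only)   # ['12345']
print(inner_only)  # ['23']
```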

Solution 3:

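import re

string = 'Perotto, Pier Giorgio'
# The input is "last name, first name", so the first group captures the last name.
names = re.findall(r'''
                 (?P<last>[-\w ]+),\s  # last name, before the comma
                 (?P<first>[-\w ]+)    # first name(s), after the comma
                 ''', string, re.X|re.M)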

print(names)

This prints:

[('Perotto', 'Pier Giorgio')]

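re.M only changes how ^ and $ behave on a multiline string; this pattern uses neither, so the flag has no effect here and can be dropped. re.X (the short form of re.VERBOSE) is required, though, because the pattern contains whitespace and # comments that must be ignored. The ''' quoting by itself only lets the pattern span several lines.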
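
Note that findall ignores group names and still returns positional tuples. If the names matter, one option is a list comprehension over finditer with groupdict(), which keeps the call site to a single expression. This is a sketch only; the group names surname and given are hypothetical and chosen just for this illustration:

```python
import re

string = 'Perotto, Pier Giorgio'
pattern = re.compile(r'''
    (?P<surname>[-\w ]+),\s   # text before the comma
    (?P<given>[-\w ]+)        # text after the comma
    ''', re.X)

# findall drops the names and returns positional tuples.
as_tuples = pattern.findall(string)

# groupdict() keeps the names, giving one dict per match.
as_dicts = [m.groupdict() for m in pattern.finditer(string)]

print(as_tuples)  # [('Perotto', 'Pier Giorgio')]
print(as_dicts)   # [{'surname': 'Perotto', 'given': 'Pier Giorgio'}]
```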