Python regular expression: find and output part of a pattern multiple times
Use re.findall:
import re
inp = '2021/12/23 13:00 14:00 2021/12/24 13:00 14:00 15:00'
matches = re.findall(r'\d{4}/\d{2}/\d{2}(?: \d{1,2}:\d{2})*', inp)
print(matches)
This prints:
['2021/12/23 13:00 14:00', '2021/12/24 13:00 14:00 15:00']
Explanation of the regex:
\d{4}/\d{2}/\d{2} matches a date in YYYY/MM/DD format
(?: \d{1,2}:\d{2})* matches a space followed by an hh:mm time, zero or more times
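To illustrate why the group is written as non-capturing, here is a small sketch comparing the two forms. The variable names `capturing` and `non_capturing` are made up for this example. With a capturing group, re.findall returns the group's contents instead of the whole match, and a repeated group only keeps its last repetition:

```python
import re

inp = '2021/12/23 13:00 14:00 2021/12/24 13:00 14:00 15:00'

# Capturing group: findall returns only the group, and a repeated
# group keeps just its final repetition (the last time on each date).
capturing = re.findall(r'\d{4}/\d{2}/\d{2}( \d{1,2}:\d{2})*', inp)

# Non-capturing group: findall returns the whole match.
non_capturing = re.findall(r'\d{4}/\d{2}/\d{2}(?: \d{1,2}:\d{2})*', inp)

print(capturing)      # [' 14:00', ' 15:00']
print(non_capturing)  # ['2021/12/23 13:00 14:00', '2021/12/24 13:00 14:00 15:00']
```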
You can use this findall + split solution:
import re
s = '2021/12/23 13:00 14:00 2021/12/24 13:00 14:00 15:00'
for i in re.findall(r'\d+/\d+/\d+(?:\s\d+\:\d+)+', s):
    print(i.split())
Output:
['2021/12/23', '13:00', '14:00']
['2021/12/24', '13:00', '14:00', '15:00']
\d+/\d+/\d+(?:\s\d+\:\d+)+ matches a date string followed by one or more time strings.
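The "one or more" part matters when a date has no times after it. The sketch below uses a made-up input string (not from the question) to compare the + quantifier with a * variant of the same pattern; the names `plus` and `star` are only for illustration:

```python
import re

# Hypothetical input: the middle date has no times after it.
s = '2021/12/23 13:00 2021/12/24 2021/12/25 09:30'

# '+' requires at least one time, so the bare date is skipped.
plus = re.findall(r'\d+/\d+/\d+(?:\s\d+\:\d+)+', s)

# '*' allows zero times, so the bare date is kept.
star = re.findall(r'\d+/\d+/\d+(?:\s\d+\:\d+)*', s)

print(plus)  # ['2021/12/23 13:00', '2021/12/25 09:30']
print(star)  # ['2021/12/23 13:00', '2021/12/24', '2021/12/25 09:30']
```

Which one is appropriate depends on whether dates without times should appear in the result.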
You could also use:
print([i.split() for i in re.findall(r'\d+/\d+/\d+(?:\s\d+\:\d+)+', s)])
To get output:
[['2021/12/23', '13:00', '14:00'], ['2021/12/24', '13:00', '14:00', '15:00']]
You can use the PyPI regex library to get the following to work:
import regex
pattern = regex.compile(r'(?P<date>\d+/\d+/\d+)(?:\s+(?P<time>\d+:\d+))+')
for m in pattern.finditer('2021/12/23 13:00 14:00 2021/12/24 13:00 14:00 15:00'):
    print(m.capturesdict())
Output:
{'date': ['2021/12/23'], 'time': ['13:00', '14:00']}
{'date': ['2021/12/24'], 'time': ['13:00', '14:00', '15:00']}
Because the PyPI regex library does not "forget" earlier captures when a group repeats, and provided the groups are named, match.capturesdict() returns a dictionary that maps each group name to the list of all its captures.
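If installing a third-party module is not an option, a similar result can probably be approximated with the standard re module. This is only a sketch: the helper name `captures_with_stdlib` and the extra `times` group are assumptions made for this example, not part of either library. It captures the whole run of times as one group and splits it afterwards:

```python
import re

def captures_with_stdlib(text):
    """Approximate regex's capturesdict() using only the stdlib (hypothetical helper)."""
    # 'times' captures the entire run of times; re keeps only one
    # capture per group, so the run is split into items afterwards.
    pattern = re.compile(r'(?P<date>\d+/\d+/\d+)(?P<times>(?:\s+\d+:\d+)+)')
    return [
        {'date': [m.group('date')], 'time': m.group('times').split()}
        for m in pattern.finditer(text)
    ]

result = captures_with_stdlib('2021/12/23 13:00 14:00 2021/12/24 13:00 14:00 15:00')
for item in result:
    print(item)
# {'date': ['2021/12/23'], 'time': ['13:00', '14:00']}
# {'date': ['2021/12/24'], 'time': ['13:00', '14:00', '15:00']}
```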