Split string using regular expression, how to ignore apostrophe?
First you'll need to fix the original expression by replacing `)` with `]`, as mentioned by Marcin. Then simply add `'` to the list of allowed characters (escaped with a backslash):
import re

def split_line(line):
    # The apostrophe is escaped because the pattern is a single-quoted string.
    return re.findall('[A-Za-z\']+(?:`[A-Za-z]+)?', line)

split_line("He's my hero")
# ["He's", 'my', 'hero']
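Of course, this does not handle edge cases where the apostrophe is at the beginning or at the end of a word: because the apostrophe is allowed anywhere in the match, a quoted word such as 'hero' comes back with its quotes still attached.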
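If apostrophes should only count when they sit between letters, one possible variation is to require a letter on both sides of each apostrophe. This is only a sketch, not part of the original answer: the name `split_words_inner_apostrophes` is made up for illustration, and it assumes that dropping leading and trailing apostrophes (including a trailing possessive, as in "dogs'") is acceptable.

```python
import re

# Hypothetical helper: an apostrophe is kept only when letters
# appear on both sides of it.
def split_words_inner_apostrophes(line):
    # One or more letters, optionally followed by repeated
    # "apostrophe + letters" groups (e.g. "He's", "rock'n'roll").
    return re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)*", line)

print(split_words_inner_apostrophes("He's my 'hero'"))
# ["He's", 'my', 'hero']
```

The trade-off is that a trailing possessive loses its apostrophe, so "dogs' bowls" yields ['dogs', 'bowls'].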