Python re.sub back reference not back referencing [duplicate]
Solution 1:
You need to use a raw string here so that Python doesn't process the backslash as an escape character before re.sub ever sees it:
>>> import re
>>> fileText = '<text top="52" left="20" width="383" height="15" font="0"><b>test</b></text>'
>>> fileText = re.sub("<b>(.*?)</b>", r"\1", fileText, flags=re.DOTALL)
>>> fileText
'<text top="52" left="20" width="383" height="15" font="0">test</text>'
>>>
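Notice how "\1" was changed to r"\1". Though it is a very small change (one character), it has a big effect: without the r prefix, Python reads \1 as an octal escape for the character with code 1, so the replacement string no longer contains a backreference at all. See below: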
>>> "\1"
'\x01'
>>> r"\1"
'\\1'
>>>
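If it helps, here is a small sketch of the same substitution written a few different ways. The variable names (file_text, pattern, broken, and so on) are only for illustration and are not from the question. It assumes the same input string as above, and compares the non-raw replacement with three spellings that should all reach re.sub as a real reference to group 1:

```python
import re

file_text = '<text top="52" left="20" width="383" height="15" font="0"><b>test</b></text>'
pattern = r"<b>(.*?)</b>"

# Without the r prefix, "\1" is the single control character '\x01',
# so re.sub inserts that character instead of the captured group.
broken = re.sub(pattern, "\1", file_text, flags=re.DOTALL)

# Raw string: the backslash reaches re.sub intact.
raw = re.sub(pattern, r"\1", file_text, flags=re.DOTALL)

# Doubled backslash in a normal string: same two characters as r"\1".
escaped = re.sub(pattern, "\\1", file_text, flags=re.DOTALL)

# \g<1> is the unambiguous form, handy when a digit follows the group number.
explicit = re.sub(pattern, r"\g<1>", file_text, flags=re.DOTALL)

# A function replacement sidesteps replacement-string escapes entirely.
func = re.sub(pattern, lambda m: m.group(1), file_text, flags=re.DOTALL)

print(repr(broken))
print(raw)
```

The raw string is usually the least noisy choice; the \g<1> form is mostly useful when the replacement would otherwise be ambiguous, for example group 1 followed by a literal 0.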