How to convert UTF-8 notation to Python Unicode notation
You are confusing how a string is displayed (its representation) with what it actually contains (its value). First, turn the u+XXXX text into the real character:
import re
s = "u+00a0"
# raw string so that \+ reaches the regex engine unchanged
re.sub(r"u\+([0-9a-f]{4})", lambda m: chr(int(m.group(1), 16)), s)
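For u+00a0 the result is displayed as '\xa0', but the same is true of the literal "\u00a0". Both are the same one-character string; repr() simply shows the shortest escape for it.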
s = "\u00a0"
print(repr(s))
# '\xa0'
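A quick way to convince yourself that the two spellings denote the same value is to compare them directly. This is only a minimal sketch; the variable names are illustrative.

```python
a = "\u00a0"
b = "\xa0"

same_value = (a == b)     # both literals denote the single code point U+00A0
length = len(a)           # one character, not six
code_point = hex(ord(a))  # numeric value of that character

print(same_value, length, code_point)
# True 1 0xa0
```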
Once you have the proper value as a Unicode string, you can encode it to UTF-8:
s = "\xa0"
print(s.encode('utf8'))
# b'\xc2\xa0'
So, putting it all together, the final answer is:
import re
s = "u+00a0"
s2 = re.sub(r"u\+([0-9a-f]{4})", lambda m: chr(int(m.group(1), 16)), s)
s_bytes = s2.encode('utf8') # b'\xc2\xa0'
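If the input may use either "u+" or "U+", upper-case hex digits, or code points beyond four digits, the same idea can be wrapped in a small function. This is a sketch under assumptions, not a definitive implementation: the names `_CODEPOINT` and `decode_codepoints` are hypothetical, and the choice of 4 to 6 hex digits is mine rather than something stated above.

```python
import re

# Hypothetical pattern: "u+" or "U+" followed by 4 to 6 hex digits.
_CODEPOINT = re.compile(r"u\+([0-9a-f]{4,6})", re.IGNORECASE)

def decode_codepoints(text):
    """Replace each U+XXXX token in text with the character it names."""
    return _CODEPOINT.sub(lambda m: chr(int(m.group(1), 16)), text)

print(decode_codepoints("U+00A0").encode("utf-8"))
# b'\xc2\xa0'
```

Note that the match is greedy, so hex digits immediately following a token (as in "U+00A0ab") would be absorbed into it, and a value above 0x10FFFF makes chr() raise ValueError.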
You can also use this:
>>> s = 'U+00A0'
>>> s = s.replace('U+', '\\u').encode().decode('unicode_escape').encode()
>>> s
b'\xc2\xa0'
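One caveat with the unicode_escape approach, as far as I can tell: that codec treats every byte that is not part of an escape sequence as Latin-1, so it is only safe when the rest of the string is plain ASCII. The sample string below is made up to illustrate this.

```python
text = "é U+00A0"  # made-up input that already contains a non-ASCII character

# Same chain as above, stopping before the final encode() to inspect the string.
mangled = text.replace("U+", "\\u").encode().decode("unicode_escape")

# The two UTF-8 bytes of "é" come back as two separate Latin-1 characters.
print(mangled == "é \xa0")
# False
```

For mixed input like this, the re.sub approach is the safer choice because it only touches the u+XXXX tokens.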