Python ElementTree module: How to ignore the namespace of XML files to locate matching elements when using the "find" and "findall" methods
Solution 1:
Instead of modifying the XML document itself, it's best to parse it and then modify the tags in the result. This way you can handle multiple namespaces and namespace aliases:
from io import StringIO # for Python 2 import from StringIO instead
import xml.etree.ElementTree as ET
# instead of ET.fromstring(xml)
it = ET.iterparse(StringIO(xml))
for _, el in it:
    prefix, has_namespace, postfix = el.tag.partition('}')
    if has_namespace:
        el.tag = postfix  # strip all namespaces
root = it.root
This is based on the discussion here: http://bugs.python.org/issue18304
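To see the effect on find and findall, here is a minimal self-contained sketch of the same approach. The helper name strip_namespaces and the sample document are made up for illustration; they are not part of ElementTree.

```python
from io import StringIO
import xml.etree.ElementTree as ET


def strip_namespaces(xml_text):
    """Parse xml_text and drop the '{uri}' prefix from every element tag."""
    it = ET.iterparse(StringIO(xml_text))
    for _, el in it:
        _, has_namespace, postfix = el.tag.partition('}')
        if has_namespace:
            el.tag = postfix
    # it.root is only available once the iterator has been exhausted
    return it.root


# Hypothetical sample with a default namespace and a prefixed one.
SAMPLE = (
    '<root xmlns="http://example.com/a" xmlns:b="http://example.com/b">'
    '<item>one</item><b:item>two</b:item>'
    '</root>'
)

root = strip_namespaces(SAMPLE)
items = [el.text for el in root.findall('item')]
print(root.tag, items)  # root ['one', 'two']
```

Note that this only renames element tags. Attribute names that carry a namespace keep their '{uri}' prefix, and elements from different namespaces that share a local name become indistinguishable.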
Update: rpartition instead of partition makes sure you get the tag name in postfix even if there is no namespace. Thus you could condense it:
for _, el in it:
    _, _, el.tag = el.tag.rpartition('}')  # strip ns
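The difference between the two string methods is easiest to see on bare tag strings. The tag values below are illustrative examples of what ElementTree reports, not taken from a real document.

```python
# Tag names as ElementTree reports them: '{uri}local' or just 'local'.
namespaced = '{http://example.com/a}item'
plain = 'item'

# partition leaves the last field empty when '}' is absent...
print(plain.partition('}'))        # ('item', '', '')
# ...whereas rpartition puts the whole string in the last field.
print(plain.rpartition('}'))       # ('', '', 'item')
print(namespaced.rpartition('}'))  # ('{http://example.com/a', '}', 'item')
```

This is why the condensed loop can assign the last field to el.tag unconditionally.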
Solution 2:
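If you remove the default xmlns attribute from the XML string before parsing it, then there won't be a namespace prepended to each tag in the tree. Note that the pattern below only removes the first double-quoted default declaration (count=1); prefixed declarations such as xmlns:foo="..." are left in place.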
import re
xmlstring = re.sub(' xmlns="[^"]+"', '', xmlstring, count=1)
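As a quick check of this approach, the following sketch parses a small made-up document before and after the substitution. The sample string and the variable names cleaned, before and after are assumptions for the example.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical sample with a single default namespace declaration.
xmlstring = '<root xmlns="http://example.com/a"><item>one</item></root>'

# Without the substitution, every tag carries the '{uri}' prefix.
before = ET.fromstring(xmlstring)
print(before.tag)  # {http://example.com/a}root

# Remove the first default xmlns declaration, then parse.
cleaned = re.sub(' xmlns="[^"]+"', '', xmlstring, count=1)
after = ET.fromstring(cleaned)
print(after.tag, after.find('item').text)  # root one
```

Because this edits the raw text, it may also match a literal xmlns="..." that happens to appear inside a comment or CDATA section, so it is best kept to documents whose shape you know.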