Can ElementTree be told to preserve the order of attributes?

With help from @bobince's answer and two others (on setting attribute order and on overriding module methods), I managed to get this monkey-patched.

It's dirty, and I'd suggest using another module that handles this scenario better, but when that isn't a possibility:

# =======================================================================
# Monkey patch ElementTree
import xml.etree.ElementTree as ET

def _serialize_xml(write, elem, encoding, qnames, namespaces):
    tag = elem.tag
    text = elem.text
    if tag is ET.Comment:
        write("<!--%s-->" % ET._encode(text, encoding))
    elif tag is ET.ProcessingInstruction:
        write("<?%s?>" % ET._encode(text, encoding))
    else:
        tag = qnames[tag]
        if tag is None:
            if text:
                write(ET._escape_cdata(text, encoding))
            for e in elem:
                _serialize_xml(write, e, encoding, qnames, None)
        else:
            write("<" + tag)
            items = elem.items()
            if items or namespaces:
                if namespaces:
                    for v, k in sorted(namespaces.items(),
                                       key=lambda x: x[1]):  # sort on prefix
                        if k:
                            k = ":" + k
                        write(" xmlns%s=\"%s\"" % (
                            k.encode(encoding),
                            ET._escape_attrib(v, encoding)
                            ))
                #for k, v in sorted(items):  # lexical order
                for k, v in items: # Monkey patch
                    if isinstance(k, ET.QName):
                        k = k.text
                    if isinstance(v, ET.QName):
                        v = qnames[v.text]
                    else:
                        v = ET._escape_attrib(v, encoding)
                    write(" %s=\"%s\"" % (qnames[k], v))
            if text or len(elem):
                write(">")
                if text:
                    write(ET._escape_cdata(text, encoding))
                for e in elem:
                    _serialize_xml(write, e, encoding, qnames, None)
                write("</" + tag + ">")
            else:
                write(" />")
    if elem.tail:
        write(ET._escape_cdata(elem.tail, encoding))

ET._serialize_xml = _serialize_xml

from collections import OrderedDict

class OrderedXMLTreeBuilder(ET.XMLTreeBuilder):
    def _start_list(self, tag, attrib_in):
        fixname = self._fixname
        tag = fixname(tag)
        attrib = OrderedDict()
        if attrib_in:
            for i in range(0, len(attrib_in), 2):
                attrib[fixname(attrib_in[i])] = self._fixtext(attrib_in[i+1])
        return self._target.start(tag, attrib)

# =======================================================================

Then in your code:

tree = ET.parse(pathToFile, OrderedXMLTreeBuilder())
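
An aside for readers on a newer interpreter: as far as I know, from Python 3.8 onward ElementTree writes attributes in insertion order instead of sorting them, so the patch above should not be needed there. A minimal sketch, assuming Python 3.8 or later (the sample document and variable names are made up for illustration):

```python
import io
import xml.etree.ElementTree as ET

# Sample input with attributes deliberately out of alphabetical order.
source = '<a b="c" d="e" f="g" j="k" h="i"/>'

# Parse with the stock parser; attrib is a plain dict, which keeps
# insertion order on modern Python.
root = ET.fromstring(source)
attribute_names = list(root.attrib)

# Serialize with the stock writer; no monkey patch installed.
buffer = io.BytesIO()
ET.ElementTree(root).write(buffer)
serialized = buffer.getvalue().decode("ascii")

print(attribute_names)
print(serialized)
```

If the interpreter behaves as assumed, "j" is printed before "h" in both lines, matching the source document rather than lexical order.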

Nope. ElementTree uses a dictionary to store attribute values, so it's inherently unordered.

Even DOM doesn't guarantee you attribute ordering, and DOM exposes a lot more detail of the XML infoset than ElementTree does. (There are some DOMs that do offer it as a feature, but it's not standard.)

Can it be fixed? Maybe. Here's a stab at it that replaces the dictionary when parsing with an ordered one (collections.OrderedDict()).

from xml.etree import ElementTree
from collections import OrderedDict
import StringIO

class OrderedXMLTreeBuilder(ElementTree.XMLTreeBuilder):
    def _start_list(self, tag, attrib_in):
        fixname = self._fixname
        tag = fixname(tag)
        attrib = OrderedDict()
        if attrib_in:
            for i in range(0, len(attrib_in), 2):
                attrib[fixname(attrib_in[i])] = self._fixtext(attrib_in[i+1])
        return self._target.start(tag, attrib)

>>> xmlf = StringIO.StringIO('<a b="c" d="e" f="g" j="k" h="i"/>')

>>> tree = ElementTree.ElementTree()
>>> root = tree.parse(xmlf, OrderedXMLTreeBuilder())
>>> root.attrib
OrderedDict([('b', 'c'), ('d', 'e'), ('f', 'g'), ('j', 'k'), ('h', 'i')])

Looks potentially promising.

>>> s = StringIO.StringIO()
>>> tree.write(s)
>>> s.getvalue()
'<a b="c" d="e" f="g" h="i" j="k" />'

Bah, the serialiser outputs them in sorted (lexical) order.

This looks like the line to blame, in ElementTree._write:

            items.sort() # lexical order

Subclassing or monkey-patching that is going to be annoying as it's right in the middle of a big method.

Unless you did something nasty like subclass OrderedDict and hack items to return a special subclass of list that ignores calls to sort(). Nah, probably that's even worse and I should go to bed before I come up with anything more horrible than that.
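
For what it's worth, the "nasty" idea can be sketched in a few lines. The class names UnsortableList and OrderPreservingAttrib are made up for illustration, and the trick only defeats an in-place items.sort() call like the one quoted above; a serializer that calls sorted(items) builds a fresh plain list and is unaffected.

```python
from collections import OrderedDict


class UnsortableList(list):
    """A list whose sort() is a no-op, so callers cannot reorder it in place."""

    def sort(self, *args, **kwargs):
        # Deliberately do nothing: keep the existing order.
        pass


class OrderPreservingAttrib(OrderedDict):
    """An OrderedDict whose items() returns a list that ignores sort()."""

    def items(self):
        return UnsortableList(OrderedDict.items(self))


attrib = OrderPreservingAttrib([("b", "c"), ("j", "k"), ("h", "i")])

items = attrib.items()
items.sort()  # silently ignored, order stays b, j, h

resorted = sorted(items)  # a new plain list, so this one IS sorted

print(items)
print(resorted)
```

Whether this counts as a fix or as a booby trap for the next maintainer is left to the reader.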
