How do I detect XML parsing errors when using JavaScript's DOMParser in a cross-browser way?
It seems that all major browsers implement the DOMParser API so that XML can be parsed into a DOM and then queried using XPath, getElementsByTagName, etc.
However, detecting parsing errors seems to be trickier. DOMParser.prototype.parseFromString always returns a valid DOM. When a parsing error occurs, the returned DOM contains a <parsererror> element, but it's slightly different in each major browser.
Sample JavaScript:
// The prefix "otherr" is not bound to a namespace, so this document fails to parse.
var xmlText = '<root xmlns="http://default" xmlns:other="http://other"><child><otherr:grandchild/></child></root>';
var parser = new DOMParser();
var dom = parser.parseFromString(xmlText, 'application/xml');
console.log((new XMLSerializer()).serializeToString(dom));
Result in Opera:
The DOM's root is a <parsererror> element.
<?xml version="1.0"?><parsererror xmlns="http://www.mozilla.org/newlayout/xml/parsererror.xml">Error<sourcetext>Unknown source</sourcetext></parsererror>
Result in Firefox:
The DOM's root is a <parsererror> element.
<?xml-stylesheet href="chrome://global/locale/intl.css" type="text/css"?>
<parsererror xmlns="http://www.mozilla.org/newlayout/xml/parsererror.xml">XML Parsing Error: prefix not bound to a namespace
Location: http://fiddle.jshell.net/_display/
Line Number 1, Column 64:<sourcetext><root xmlns="http://default" xmlns:other="http://other"><child><otherr:grandchild/></child></root>
---------------------------------------------------------------^</sourcetext></parsererror>
Result in Safari:
The <root> element parses correctly but contains a nested <parsererror> in a different namespace than Opera and Firefox's <parsererror> element.
<root xmlns="http://default" xmlns:other="http://other"><parsererror xmlns="http://www.w3.org/1999/xhtml" style="display: block; white-space: pre; border: 2px solid #c77; padding: 0 1em 0 1em; margin: 1em; background-color: #fdd; color: black"><h3>This page contains the following errors:</h3><div style="font-family:monospace;font-size:12px">error on line 1 at column 50: Namespace prefix otherr on grandchild is not defined
</div><h3>Below is a rendering of the page up to the first error.</h3></parsererror><child><otherr:grandchild/></child></root>
Am I missing a simple, cross-browser way of detecting if a parsing error occurred anywhere in the XML document? Or must I query the DOM for each of the possible <parsererror> elements that different browsers might generate?
Solution 1:
This is the best solution I've come up with.
I attempt to parse a string that is intentionally invalid XML and observe the namespace of the resulting <parsererror> element. Then, when parsing actual XML, I can use getElementsByTagNameNS to detect the same kind of <parsererror> element and throw a JavaScript Error.
// My function that parses a string into an XML DOM, throwing an Error if XML parsing fails
function parseXml(xmlString) {
    var parser = new DOMParser();
    // attempt to parse the passed-in xml
    var dom = parser.parseFromString(xmlString, 'application/xml');
    if (isParseError(dom)) {
        throw new Error('Error parsing XML');
    }
    return dom;
}

function isParseError(parsedDocument) {
    // parser and parsererrorNS could be cached on startup for efficiency
    var parser = new DOMParser(),
        erroneousParse = parser.parseFromString('<', 'application/xml'),
        parsererrorNS = erroneousParse.getElementsByTagName("parsererror")[0].namespaceURI;

    if (parsererrorNS === 'http://www.w3.org/1999/xhtml') {
        // In PhantomJS the parsererror element doesn't seem to have a special namespace, so we are just guessing here :(
        return parsedDocument.getElementsByTagName("parsererror").length > 0;
    }

    return parsedDocument.getElementsByTagNameNS(parsererrorNS, 'parsererror').length > 0;
}
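The comment in isParseError mentions that the parser and the namespace could be cached. A minimal sketch of that idea follows. makeParseErrorDetector and its createParser argument are hypothetical names introduced here, not part of the answer above; the injection point exists only so the function can be exercised outside a browser. Like the original, it assumes the probe parse of '<' always yields a <parsererror> element.

```javascript
// Sketch only: makeParseErrorDetector and createParser are hypothetical names.
// createParser defaults to the browser's DOMParser; pass a stub elsewhere.
function makeParseErrorDetector(createParser) {
    createParser = createParser || function () { return new DOMParser(); };
    var XHTML_NS = 'http://www.w3.org/1999/xhtml';
    var cachedNS;        // namespace of this environment's <parsererror>
    var probed = false;  // becomes true once the probe parse has run

    function parsererrorNS() {
        if (!probed) {
            // Parse deliberately invalid XML once and remember the error namespace.
            var probe = createParser().parseFromString('<', 'application/xml');
            cachedNS = probe.getElementsByTagName('parsererror')[0].namespaceURI;
            probed = true;
        }
        return cachedNS;
    }

    return function isParseError(parsedDocument) {
        var ns = parsererrorNS();
        if (ns === XHTML_NS) {
            // Same fallback as above: no dedicated namespace to match on.
            return parsedDocument.getElementsByTagName('parsererror').length > 0;
        }
        return parsedDocument.getElementsByTagNameNS(ns, 'parsererror').length > 0;
    };
}

// Usage in a browser:
// var isParseError = makeParseErrorDetector();
// var dom = new DOMParser().parseFromString('<a/>', 'application/xml');
// if (isParseError(dom)) { throw new Error('Error parsing XML'); }
```

The probe runs lazily on the first call rather than at startup, so pages that never parse XML pay nothing for it.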
Note that this solution doesn't include the special-casing needed for Internet Explorer. However, things are much more straightforward in IE. XML is parsed with a loadXML method which returns true or false if parsing succeeded or failed, respectively. See http://www.w3schools.com/xml/xml_parser.asp for an example.
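For completeness, here is a rough sketch of the Internet Explorer path described above. It assumes the legacy MSXML ActiveX document ('Microsoft.XMLDOM') exposed by older IE versions. parseXmlLegacyIE and the createDocument parameter are hypothetical names; the parameter is there only so the function can be tried with a stub where ActiveXObject is unavailable.

```javascript
// Sketch only: parseXmlLegacyIE and createDocument are hypothetical names.
function parseXmlLegacyIE(xmlString, createDocument) {
    createDocument = createDocument || function () {
        // Only available in old Internet Explorer.
        return new ActiveXObject('Microsoft.XMLDOM');
    };
    var xmlDoc = createDocument();
    xmlDoc.async = false; // parse synchronously
    if (!xmlDoc.loadXML(xmlString)) {
        // MSXML records failure details on the document's parseError object.
        throw new Error('Error parsing XML: ' + xmlDoc.parseError.reason);
    }
    return xmlDoc;
}
```

A caller would typically pick this path only when DOMParser is missing, for example by checking typeof DOMParser === 'undefined' first.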
Solution 2:
When I came here the first time, I upvoted the original answer (by cspotcode); however, it does not work in Firefox. The resulting namespace is always "null" because of the structure of the produced document. I did a little research (check the code here). The idea is to use not
invalidXml.childNodes[0].namespaceURI
but
invalidXml.getElementsByTagName("parsererror")[0].namespaceURI
and then select the "parsererror" element by namespace, as in the original answer. However, if you have a valid XML document with a <parsererror> tag in the same namespace as the one used by the browser, you end up with a false alarm.
So, here's a heuristic to check if your XML parsed successfully:
function tryParseXML(xmlString) {
    var parser = new DOMParser();
    var parsererrorNS = parser.parseFromString('INVALID', 'application/xml').getElementsByTagName("parsererror")[0].namespaceURI;
    var dom = parser.parseFromString(xmlString, 'application/xml');
    if (dom.getElementsByTagNameNS(parsererrorNS, 'parsererror').length > 0) {
        throw new Error('Error parsing XML');
    }
    return dom;
}
Why not implement exceptions in DOMParser?
An interesting thing worth mentioning in this context: if you fetch an XML file with XMLHttpRequest, the parsed DOM will be stored in the responseXML property, or that property will be null if the XML file content was invalid. Not an exception, not a parsererror or another specific indicator. Just null.
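To make the XMLHttpRequest behaviour concrete, here is a small sketch that turns the null into an exception. requireResponseXML is a hypothetical helper name, and the code assumes the behaviour described above (null for invalid XML). Note that responseXML is also null when the request failed or has not completed, so the null alone does not say which of those happened.

```javascript
// Sketch only: requireResponseXML is a hypothetical helper name.
function requireResponseXML(xhr) {
    // readyState 4 (DONE) means the request has finished.
    if (xhr.readyState !== 4) {
        throw new Error('Request has not completed');
    }
    // null covers both a failed request and a body that is not well-formed XML.
    if (!xhr.responseXML) {
        throw new Error('No XML document available (request failed or response was not well-formed XML)');
    }
    return xhr.responseXML;
}

// Usage in a browser:
// xhr.onload = function () { var dom = requireResponseXML(xhr); };
```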