What is the difference between SAX and DOM?

I read some articles about the XML parsers and came across SAX and DOM.

SAX is event-based and DOM is tree model -- I don't understand the differences between these concepts.

From what I have understood, event-based means some kind of event happens to the node. Like when one clicks a particular node it will give all the sub nodes rather than loading all the nodes at the same time. But in the case of DOM parsing it will load all the nodes and make the tree model.

Is my understanding correct?

Please correct me If I am wrong or explain to me event-based and tree model in a simpler manner.


Well, you are close.

In SAX, events are triggered when the XML is being parsed. When the parser is parsing the XML, and encounters a tag starting (e.g. <something>), then it triggers the tagStarted event (actual name of event might differ). Similarly when the end of the tag is met while parsing (</something>), it triggers tagEnded. Using a SAX parser implies you need to handle these events and make sense of the data returned with each event.

In DOM, there are no events triggered while parsing. The entire XML is parsed and a DOM tree (of the nodes in the XML) is generated and returned. Once parsed, the user can navigate the tree to access the various data previously embedded in the various nodes in the XML.

In general, DOM is easier to use but has an overhead of parsing the entire XML before you can start using it.


In just a few words...

SAX (Simple API for XML): Is a stream-based processor. You only have a tiny part in memory at any time and you "sniff" the XML stream by implementing callback code for events like tagStarted() etc. It uses almost no memory, but you can't do "DOM" stuff, like use xpath or traverse trees.

DOM (Document Object Model): You load the whole thing into memory - it's a massive memory hog. You can blow memory with even medium sized documents. But you can use xpath and traverse the tree etc.


Here in simpler words:

DOM

  • Tree model parser (Object based) (Tree of nodes).

  • DOM loads the file into the memory and then parse- the file.

  • Has memory constraints since it loads the whole XML file before parsing.

  • DOM is read and write (can insert or delete nodes).

  • If the XML content is small, then prefer DOM parser.

  • Backward and forward search is possible for searching the tags and evaluation of the information inside the tags. So this gives the ease of navigation.

  • Slower at run time.

SAX

  • Event based parser (Sequence of events).

  • SAX parses the file as it reads it, i.e. parses node by node.

  • No memory constraints as it does not store the XML content in the memory.

  • SAX is read only i.e. can’t insert or delete the node.

  • Use SAX parser when memory content is large.

  • SAX reads the XML file from top to bottom and backward navigation is not possible.

  • Faster at run time.