I need to parse XML in .NET, and I can choose between at least XmlTextReader and XDocument. Are there any comparisons between those two (or any other XML parsers included in the framework)?

A comparison could help me decide without trying both of them in depth.

The XML files are expected to be rather small, so speed and memory usage are minor issues compared to ease of use. :-)

(I'm going to use them from C# and/or IronPython.)

Thanks!


Solution 1:

If you're happy reading everything into memory, use XDocument. It'll make your life much easier. LINQ to XML is a lovely API.
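As a rough illustration of the in-memory approach, here is a minimal sketch. The sample document and its element names are made up, and it uses C# 9 top-level statements for brevity (in older projects, put the body inside a Main method).

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

// Hypothetical sample document; the element and attribute names are invented.
string xml = @"<books>
  <book id=""1""><title>First</title></book>
  <book id=""2""><title>Second</title></book>
</books>";

// Parse the whole document into memory (XDocument.Load does the same for a file or stream).
XDocument doc = XDocument.Parse(xml);

// Query it with LINQ: collect the title of every <book> element under the root.
var titles = doc.Root
    .Elements("book")
    .Select(b => (string)b.Element("title"))
    .ToList();

Console.WriteLine(string.Join(", ", titles)); // prints "First, Second"
```

The explicit cast from XElement to string returns the element's text, or null if the element is missing, which tends to be more forgiving than reading .Value directly.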

Use an XmlReader (such as XmlTextReader) if you need to handle huge XML files in a streaming fashion. It's a much more painful API, but it allows streaming (i.e. only dealing with data as you need it, so you can go through a huge document while holding only a small amount in memory at a time).
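For comparison, a streaming read might look roughly like the sketch below. The log format and the "entry"/"level" names are assumptions made up for the example; a StringReader stands in for what would normally be a large file or stream.

```csharp
using System;
using System.IO;
using System.Xml;

// Hypothetical input; in practice this would be a large file or network stream.
string xml = "<log><entry level=\"info\">started</entry><entry level=\"error\">failed</entry></log>";

int errorCount = 0;

// XmlReader.Create accepts a file path, a Stream or a TextReader.
using (XmlReader reader = XmlReader.Create(new StringReader(xml)))
{
    // Read() advances one node at a time; nothing is retained once the reader moves on.
    while (reader.Read())
    {
        if (reader.NodeType == XmlNodeType.Element
            && reader.Name == "entry"
            && reader.GetAttribute("level") == "error")
        {
            errorCount++;
        }
    }
}

Console.WriteLine(errorCount);
```

Note how the code has to track node types and names by hand; that bookkeeping is what makes this API more painful than LINQ to XML.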

There's a hybrid approach, however - if you have a huge document made up of small elements, you can create an XElement from an XmlReader positioned at the start of the element, deal with the element using LINQ to XML, then move the XmlReader onto the next element and start again.
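A minimal sketch of that hybrid approach, assuming a made-up document of repeated "item" elements: XNode.ReadFrom materializes one element at a time from the reader, so only the current element is held in memory.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Linq;

// Hypothetical input: a big list made of many small <item> elements.
string xml = "<items><item><name>a</name></item><item><name>b</name></item><item><name>c</name></item></items>";

var names = new List<string>();

using (XmlReader reader = XmlReader.Create(new StringReader(xml)))
{
    reader.MoveToContent(); // position on the root <items> element
    reader.Read();          // step inside the root

    while (!reader.EOF)
    {
        if (reader.NodeType == XmlNodeType.Element && reader.Name == "item")
        {
            // ReadFrom builds an XElement for just this element and leaves the
            // reader on the node after its end tag, so no extra Read() here.
            var item = (XElement)XNode.ReadFrom(reader);
            names.Add((string)item.Element("name"));
        }
        else
        {
            reader.Read();
        }
    }
}

Console.WriteLine(string.Join(",", names));
```

The one subtlety is that XNode.ReadFrom already advances the reader; calling Read() again after it would skip every other element when items are adjacent.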

Solution 2:

XmlTextReader is effectively deprecated; do not use it.

  1. From the MSDN blog posts by the XML team:

    Effective Xml Part 1: Choose the right API

    Avoid using XmlTextReader. It contains quite a few bugs that could not be fixed without breaking existing applications already using it.

    The world has moved on, have you? Xml APIs you should avoid using.

    Obsolete APIs are easy since the compiler helps identifying them but there are two more APIs you should avoid using – namely XmlTextReader and XmlTextWriter. We found a number of bugs in these classes which we could not fix without breaking existing applications. The easy route would be to deprecate these classes and ask people to use replacement APIs instead. Unfortunately these two classes cannot be marked as obsolete because they are part of ECMA-335 (Common Language Infrastructure) standard (http://www.ecma-international.org/publications/standards/Ecma-335.htm – the companion CLILibrary.xml file which is a part of Partition IV).

    The good news is that even though these classes are not deprecated there are replacement APIs for these in .NET Framework already and moving to them is relatively easy. First it is necessary to find the places where XmlTextReader or XmlTextWriter is being used (unfortunately it is a manual step). Now all the occurrences of XmlTextReader should be replaced with XmlReader and all the occurrences of XmlTextWriter should be replaced with XmlWriter (note that XmlTextReader derives from XmlReader and XmlTextWriter derives from XmlWriter so the app can already be using these e.g. as formal parameters). The last step is to change the way the XmlReader/XmlWriter objects are instantiated – instead of creating the reader/writer directly it is necessary to use the static factory method .Create() present on both XmlReader and XmlWriter APIs.

  2. Furthermore, IntelliSense in Visual Studio doesn't list XmlTextReader under the System.Xml namespace. The class is defined as:

    [EditorBrowsable(EditorBrowsableState.Never)]
    public class XmlTextReader : XmlReader, IXmlLineInfo, IXmlNamespaceResolver
    

The XmlReader.Create factory methods return other internal implementations of the abstract class XmlReader depending on the settings passed.
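The migration described in the quoted post could look roughly like this sketch. The sample XML and the chosen settings are only illustrative assumptions; the point is that behaviour is configured through XmlReaderSettings and the concrete reader type is picked by the factory.

```csharp
using System;
using System.IO;
using System.Xml;

// Hypothetical input with a comment and some insignificant whitespace.
string xml = "<!-- header --><root>  <child/>  </root>";

// Before: var reader = new XmlTextReader(new StringReader(xml));
// After: describe the desired behaviour and let the factory choose the implementation.
var settings = new XmlReaderSettings
{
    IgnoreComments = true,
    IgnoreWhitespace = true
};

int elementCount = 0;
bool sawComment = false;
string typeName;

using (XmlReader reader = XmlReader.Create(new StringReader(xml), settings))
{
    // The concrete type is an internal XmlReader subclass, not XmlTextReader.
    typeName = reader.GetType().Name;

    while (reader.Read())
    {
        if (reader.NodeType == XmlNodeType.Element) elementCount++;
        if (reader.NodeType == XmlNodeType.Comment) sawComment = true;
    }
}

Console.WriteLine($"{typeName}: {elementCount} elements, comment seen: {sawComment}");
```

Code that only depends on the abstract XmlReader type keeps working unchanged after the switch, since only the construction site differs.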


For a forward-only streaming API (i.e. one that doesn't load the entire document into memory), use XmlReader via the XmlReader.Create method.

For an easier API to work with, go for XDocument, a.k.a. LINQ to XML.