What's so bad about building XML with string concatenation?

You can end up with invalid XML, but you will not find out until you parse it again - and then it is too late. I learned this the hard way.


I think readability, flexibility and scalability are important factors. Consider the following piece of Linq-to-Xml:

XDocument doc = new XDocument(new XDeclaration("1.0","UTF-8","yes"),
   new XElement("products", from p in collection
    select new XElement("product",
        new XAttribute("guid", p.ProductId), 
        new XAttribute("title", p.Title),
        new XAttribute("version", p.Version))));

Can you find a way to do it easier than this? I can output it to a browser, save it to a document, add attributes/elements in seconds and so on ... just by adding couple lines of code. I can do practically everything with it without much of effort.


Actually, I find the biggest problem with string concatenation is not getting it right the first time, but rather keeping it right during code maintenance. All too often, a perfectly-written piece of XML using string concat is updated to meet a new requirement, and string concat code is just too brittle.

As long as the alternatives were XML serialization and XmlDocument, I could see the simplicity argument in favor of string concat. However, ever since XDocument et. al., there is just no reason to use string concat to build XML anymore. See Sander's answer for the best way to write XML.

Another benefit of XDocument is that XML is actually a rather complex standard, and most programmers simply do not understand it. I'm currently dealing with a person who sends me "XML", complete with unquoted attribute values, missing end tags, improper case sensitivity, and incorrect escaping. But because IE accepts it (as HTML), it must be right! Sigh... Anyway, the point is that string concatenation lets you write anything, but XDocument will force standards-complying XML.


I wrote a blog entry back in 2006 moaning about XML generated by string concatenation; the simple point is that if an XML document fails to validate (encoding issues, namespace issues and so on) it is not XML and cannot be treated as such.

I have seen multiple problems with XML documents that can be directly attributed to generating XML documents by hand using string concatenation, and nearly always around the correct use of encoding.

Ask yourself this; what character set am I currently encoding my document with ('ascii7', 'ibm850', 'iso-8859-1' etc)? What will happen if I write a UTF-16 string value into an XML document that has been manually declared as 'ibm850'?

Given the richness of the XML support in .NET with XmlDocument and now especially with XDocument, there would have to be a seriously compelling argument for not using these libraries over basic string concatenation IMHO.


I think that the problem is that you aren't watching the xml file as a logical data storage thing, but as a simple textfile where you write strings.

It's obvious that those libraries do string manipulation for you, but reading/writing xml should be something similar to saving datas into a database or something logically similar