What is an empty element?

But since both these parts are optional, it would mean that nothing (as in, absence of characters) matches this production.

That may be true, but the wording in the spec on this issue is quite clear. There are even examples for empty elements in the next paragraph.

<IMG align="left"
 src="http://www.w3.org/Icons/WWW/w3c_home" />
<br></br>
<br/>

So the only way (in this context, with the surrounding wording and examples) to read

"An element with no content"

would be to include "content that (while matching the production) is completely empty" (i.e. zero-length, not even white-space).


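I wanted to check which variations of "empty" actually are empty. In the samples below, Space, Tab, CR and LF are placeholders for the corresponding whitespace characters, not literal text.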

Variation A

<Santa/>

gives a DOM tree of:

|- NODE_DOCUMENT #document ""
   |- NODE_ELEMENT Santa ""

Variation B

<Santa></Santa>

gives a DOM tree of:

|- NODE_DOCUMENT #document ""
   |- NODE_ELEMENT Santa ""

Variation C

<Santa>Space</Santa>

gives a DOM tree of:

|- NODE_DOCUMENT #document ""
   |- NODE_ELEMENT Santa ""

Variation D

<Santa>Tab</Santa>

gives a DOM tree of:

|- NODE_DOCUMENT #document ""
   |- NODE_ELEMENT Santa ""

Variation E

<Santa>CRLF
</Santa>

gives a DOM tree of:

|- NODE_DOCUMENT #document ""
   |- NODE_ELEMENT Santa ""

All of these variations give the same DOM tree. When an XML document is asked to serialize itself, the DOM tree:

|- NODE_DOCUMENT #document ""
   |- NODE_ELEMENT Santa ""

results in the serialized text:

<?xml version="1.0"?>
<Santa/>
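The first two variations are easy to reproduce with another DOM implementation. A minimal sketch in Python, assuming the standard-library xml.dom.minidom rather than the parser that produced the trees above; minidom keeps whitespace-only text nodes, so only variations A and B are compared here (the helper name is made up for illustration):

```python
from xml.dom.minidom import parseString

def child_count_and_xml(source):
    """Parse a document; return (child-node count of the root, re-serialized root)."""
    root = parseString(source).documentElement
    return len(root.childNodes), root.toxml()

variation_a = child_count_and_xml("<Santa/>")
variation_b = child_count_and_xml("<Santa></Santa>")

print(variation_a)
print(variation_b)
print(variation_a == variation_b)
```

Both spellings produce an element with no children, and both serialize back to the same text.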

Manually adding an empty text node

I wanted to see what happens if I build the DOM tree:

|- NODE_DOCUMENT #document ""
   |- NODE_ELEMENT Santa ""
      |- NODE_TEXT #text ""

using the pseudo-code:

XmlDocument doc = new XmlDocument();
XmlElement santa = doc.appendChild(doc.CreateElement("Santa"));
santa.appendChild(doc.CreateText(""));

When that DOM document is saved to a stream, it comes out as:

<?xml version="1.0"?>
<Santa/>

Even when the element is forced to have a child (i.e. forced to not be empty), the DOM takes it to be empty.
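The same experiment can be sketched in Python with xml.dom.minidom. This is an assumption for illustration only: it is not the API behind the pseudo-code above, and it may write `<Santa></Santa>` rather than `<Santa/>` when serializing. Either way, the zero-length text node does not survive a save and re-parse:

```python
from xml.dom.minidom import Document, parseString

doc = Document()
santa = doc.appendChild(doc.createElement("Santa"))
santa.appendChild(doc.createTextNode(""))  # zero-length text node

children_before = len(santa.childNodes)    # the node exists in memory
serialized = doc.documentElement.toxml()   # empty-element tag or start/end tag pair
children_after = len(parseString(serialized).documentElement.childNodes)

print(children_before, serialized, children_after)
```

The in-memory tree has one child, but the re-parsed tree has none, because a zero-length text node leaves no characters in the serialized document.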

Force text node with whitespace

And then if I make sure to put some whitespace in the TEXT node:

XmlDocument doc = new XmlDocument();
XmlElement santa = doc.appendChild(doc.CreateElement("Santa"));
santa.appendChild(doc.CreateText(" "));

It comes out as the XML:

<?xml version="1.0" ?>
<Santa> </Santa>

with the DOM tree:

|- NODE_DOCUMENT #document ""
   |- NODE_ELEMENT Santa ""
      |- NODE_TEXT #text " "

Interesting; it's not round-trippable. Parsing that same XML back (it is Variation C above) gives an element with no text node.

Force a TAB, LF and CR

XmlDocument doc = new XmlDocument();
XmlElement santa = doc.appendChild(doc.CreateElement("Santa"));
santa.appendChild(doc.CreateText(TAB+LF+CR));

It comes out as the XML:

<?xml version="1.0"?>
<Santa>TABLF
CR    
</Santa>

with the DOM tree:

|- NODE_DOCUMENT #document ""
   |- NODE_ELEMENT Santa ""
      |- NODE_TEXT #text "\t\n\n"

Yes, XML line-end handling turns every CR into LF (a CR LF pair becomes a single LF), and yes, it's not round-trippable. If you parse:

<?xml version="1.0"?>
<Santa>TABLF
CR   
</Santa>

you will get the DOM tree of:

|- NODE_DOCUMENT #document ""
   |- NODE_ELEMENT Santa ""

Setting element.text

Finally, we come to what happens if you set an element's text through its .text property.

Set no text:

XmlDocument doc = new XmlDocument();
XmlElement santa = doc.appendChild(doc.CreateElement("Santa"));
//santa.text = ""; example where we don't set the text

gives the DOM tree:

|- NODE_DOCUMENT #document ""
   |- NODE_ELEMENT Santa ""

and the XML:

<?xml version="1.0"?>
<Santa/>

Setting empty text

XmlDocument doc = new XmlDocument();
XmlElement santa = doc.appendChild(doc.CreateElement("Santa"));
santa.text = ""; //example where we do set the text

gives the DOM tree:

|- NODE_DOCUMENT #document ""
   |- NODE_ELEMENT Santa ""
      |- NODE_TEXT #text ""

and the XML:

<?xml version="1.0"?>
<Santa/>

Setting single space

XmlDocument doc = new XmlDocument();
XmlElement santa = doc.appendChild(doc.CreateElement("Santa"));
santa.text = " ";

gives the DOM tree:

|- NODE_DOCUMENT #document ""
   |- NODE_ELEMENT Santa ""
      |- NODE_TEXT #text " "

and the XML:

<?xml version="1.0"?>
<Santa> </Santa>

Setting more whitespace

XmlDocument doc = new XmlDocument();
XmlElement santa = doc.appendChild(doc.CreateElement("Santa"));
santa.text = LF+TAB+CR;

gives the DOM tree:

|- NODE_DOCUMENT #document ""
   |- NODE_ELEMENT Santa ""
      |- NODE_TEXT #text "\n\t\n"

and the XML:

<?xml version="1.0"?>
<Santa>LF
TABLF
</Santa>
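Python's xml.etree.ElementTree has a comparable .text attribute, so the first three cases can be sketched there too. This assumes a different library from the one behind the pseudo-code above, and ElementTree happens to write the empty-element tag with a space before the slash; the helper name is made up for illustration:

```python
import xml.etree.ElementTree as ET

def serialize_with_text(text):
    """Build <Santa> with the given .text and return it as an XML string."""
    santa = ET.Element("Santa")
    santa.text = text
    return ET.tostring(santa, encoding="unicode")

no_text = serialize_with_text(None)      # .text never set
empty_text = serialize_with_text("")     # .text set to a zero-length string
single_space = serialize_with_text(" ")  # .text set to one space

print(no_text, empty_text, single_space, sep="\n")
```

Unset text and zero-length text serialize identically as an empty-element tag, while a single space forces a start tag and end tag pair.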

So what they told you was true, from a certain point of view.

  • an XML string whose element contains only whitespace will give an empty element when parsed
  • a DOM element that contains only whitespace in its text node will render that whitespace when converted to an XML string

<element />

and

<element></element>

are both empty elements. Any productions from standards must be interpreted to have this result.