PHP DOM textContent vs nodeValue?
PHP DOMNode objects have textContent and nodeValue properties, which both seem to contain the innerHTML of the node.
nodeValue: The value of this node, depending on its type
textContent: This attribute returns the text content of this node and its descendants.
What is the difference between these two properties? When is it proper to use one instead of the other?
Solution 1:
I finally wanted to know the difference as well, so I dug into the source and found the answer; in most cases there will be no discernible difference, but there are a bunch of edge cases you should be aware of.
Both ->nodeValue and ->textContent are identical for the following classes (node types):
DOMAttr
DOMText
DOMElement
DOMComment
DOMCharacterData
DOMProcessingInstruction
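As a quick check of the list above, here is a minimal sketch that compares the two properties on an element, an attribute, a comment and a text node. The XML string and variable names are made up for illustration, and it assumes the classic DOMDocument API from the dom extension.

```php
<?php
// Hypothetical sample document, used only for illustration.
$doc = new DOMDocument();
$doc->loadXML('<root id="42"><!-- note -->Hello <b>world</b></root>');

$root = $doc->documentElement;

// One node of each kind: element, attribute, comment, text.
$nodes = [
    'DOMElement' => $root,
    'DOMAttr'    => $root->getAttributeNode('id'),
    'DOMComment' => $root->firstChild,
    'DOMText'    => $root->firstChild->nextSibling,
];

// Record whether both properties hold exactly the same string.
$same = [];
foreach ($nodes as $label => $node) {
    $same[$label] = ($node->nodeValue === $node->textContent);
    printf("%-10s identical: %s\n", $label, var_export($same[$label], true));
}
```

Each comparison should come out true, which matches the "no discernible difference" claim for these node types.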
The ->nodeValue property yields NULL for the following classes (node types):
DOMDocumentFragment
DOMDocument
DOMNotation
DOMEntity
DOMEntityReference
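To see the NULL case from the list above, a small sketch along these lines should do. The document and fragment contents are invented for the example. It only checks DOMDocument and DOMDocumentFragment, since the other classes are harder to construct by hand.

```php
<?php
// Hypothetical sample data, used only for illustration.
$doc = new DOMDocument();
$doc->loadXML('<root>text</root>');

// A fragment is one of the node types whose nodeValue is NULL.
$fragment = $doc->createDocumentFragment();
$fragment->appendXML('<a>one</a><b>two</b>');

var_dump($doc->nodeValue);        // NULL
var_dump($fragment->nodeValue);   // NULL

// textContent is still readable on the same nodes.
var_dump($fragment->textContent);
```

So for these node types textContent is the property to reach for if you want the text of the descendants.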
The ->textContent property is non-existent for the following classes:
DOMNameSpaceNode (not documented, but can be found with the //namespace::* selector)
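A rough sketch of how such namespace nodes can be obtained and inspected follows. The XPath namespace axis is written namespace::*. The sample XML and the urn:example URI are placeholders, and property_exists() is just one way to probe for the property.

```php
<?php
// Hypothetical sample document with one declared namespace prefix.
$doc = new DOMDocument();
$doc->loadXML('<root xmlns:x="urn:example"><x:child/></root>');

$xpath = new DOMXPath($doc);
$namespaces = $xpath->query('//namespace::*');

$classes = [];
$hasTextContent = [];
foreach ($namespaces as $ns) {
    $classes[] = get_class($ns);
    // Probe for the property instead of reading it, to avoid a warning.
    $hasTextContent[] = property_exists($ns, 'textContent');
    printf("%s %s => %s\n", get_class($ns), $ns->nodeName, $ns->nodeValue);
}
```

On these objects ->nodeValue holds the namespace URI, so that is the property to read there.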
The ->nodeValue property is non-existent for the following classes:
DOMDocumentType
See also: dom_node_node_value_read() and dom_node_text_content_read()
Solution 2:
Hope this will make sense:
// loadXML() is an instance method; calling it statically is an error on PHP 8.
$doc = new DOMDocument();
$doc->loadXML('<body><!-- test --><node attr="test1">old content<h1>test</h1></node></body>');
var_dump($doc->textContent);
var_dump($doc->nodeValue);
var_dump($doc->firstChild->textContent);
var_dump($doc->firstChild->nodeValue);
Output:
string(15) "old contenttest"
NULL
string(15) "old contenttest"
string(15) "old contenttest"
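Because nodeValue is "the value of this node, depending on its type": a DOMDocument node has no value of its own, so its nodeValue is NULL, while textContent still returns the text of all its descendants.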