DOMXpath - Get href attribute and text value of an a element
So I have an HTML string like this:
<td class="name">
<a href="/blah/somename23123">Some Name</a>
</td>
<td class="name">
<a href="/blah/somename28787">Some Name2</a>
</td>
Using XPath I'm able to get the value of the href attribute with this XPath query:
$domXpath = new \DOMXPath($this->domPage);
$hrefs = $domXpath->query("//td[@class='name']/a/@href");
foreach($hrefs as $href) {...}
And it's even easier to get the text value, like this:
// The node's text value contains no HTML tags, so we are
// left with the clean text value of the a element
$domXpath = new \DOMXPath($this->domPage);
$names = $domXpath->query("//td[@class='name']");
foreach($names as $name) {...}
Now I'm curious to know: how can I combine those two queries to get both values with only one query (if something like that is even possible)?
Solution 1:
Fetch //td[@class='name']/a and then pluck the text with nodeValue and the attribute with getAttribute('href').
Apart from that, you can combine XPath queries with the union operator |, so you can use
//td[@class='name']/a/@href|//td[@class='name']
as well.
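As a rough sketch of how the union query could be consumed: the result mixes td elements and href attribute nodes in one list, so each node has to be told apart by its type. The sample markup, the `<table><tr>` wrapper, and the variable names `$hrefs` and `$names` are assumptions made for illustration; they are not part of the original question.

```php
<?php
// Hypothetical sample markup; the <table><tr> wrapper is an assumption
// added so that the fragment from the question parses as a proper table.
$html = '<table><tr>'
      . '<td class="name"><a href="/blah/somename23123">Some Name</a></td>'
      . '<td class="name"><a href="/blah/somename28787">Some Name2</a></td>'
      . '</tr></table>';

$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

$hrefs = [];
$names = [];

// One query, two kinds of nodes: attribute nodes and td elements.
$nodes = $xpath->query("//td[@class='name']/a/@href|//td[@class='name']");
foreach ($nodes as $node) {
    if ($node instanceof DOMAttr) {
        // An href attribute node: its value is the link target.
        $hrefs[] = $node->value;
    } else {
        // A td element: its text content is the visible name.
        $names[] = trim($node->textContent);
    }
}

print_r($names);
print_r($hrefs);
```

Because the two kinds of node still have to be separated afterwards, fetching the a elements and reading both values from each one is usually the simpler option.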
Solution 2:
To reduce the code to a single loop, try:
$anchors = $domXpath->query("//td[@class='name']/a");
foreach($anchors as $a)
{
print $a->nodeValue." - ".$a->getAttribute("href")."<br/>";
}
Same as the answer above :) I was too slow ...
Solution 3:
The simplest way to obtain a single value is the evaluate() method, which is made for this task:
$xp = new DOMXPath($dom);
$v = $xp->evaluate("string(/etc[1]/@stringValue)");
Note: it is important to limit what the XPath expression returns to one item (the first a in this case) and to cast the value with string(), round(), etc.
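A minimal sketch of that note, applied to the markup from the question: each expression is narrowed to a single item and wrapped in an XPath function, so evaluate() hands back a plain value instead of a node list. The sample markup, the `<table><tr>` wrapper, and the variable names are assumptions for illustration only.

```php
<?php
// Hypothetical sample markup, wrapped in <table><tr> (an assumption)
// so that the td fragment parses as a proper table.
$html = '<table><tr>'
      . '<td class="name"><a href="/blah/somename23123">Some Name</a></td>'
      . '<td class="name"><a href="/blah/somename28787">Some Name2</a></td>'
      . '</tr></table>';

$dom = new DOMDocument();
$dom->loadHTML($html);
$xp = new DOMXPath($dom);

// (...)[1] limits the node set to the first matching a element,
// and string() casts the result to a plain string.
$firstHref = $xp->evaluate("string((//td[@class='name']/a)[1]/@href)");
$firstName = $xp->evaluate("string((//td[@class='name']/a)[1])");

// count() yields a number rather than a node list.
$count = $xp->evaluate("count(//td[@class='name']/a)");

var_dump($firstHref, $firstName, $count);
```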
So, for a set of multiple items, using your foreach code:
$names = $domXpath->query("//td[@class='name']");
foreach($names as $contextNode) {
$text = $domXpath->evaluate("string(./a[1])",$contextNode);
$href = $domXpath->evaluate("string(./a[1]/@href)",$contextNode);
}
PS: this example is only meant to illustrate evaluate(). When the information is already available on the node, use whatever offers the best performance, such as the methods getAttribute(), saveXML(), etc. and the properties $nodeValue, $textContent, etc. supplied by DOMNode and its subclasses.
See @Gordon's answer for this particular problem.
The XPath subquery (relative to a context node) is good for complex cases, or for simplifying your code: it lets you avoid checking hasChildNodes() and looping over $childNodes, etc., since those manual checks bring no significant gain in performance.
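To make the comparison concrete, here is a rough sketch that reads the same data twice: once by walking $childNodes by hand and once with a context-relative evaluate(). The sample markup, the `<table><tr>` wrapper, and the variable names `$manual` and `$viaXpath` are assumptions for illustration; no timing claim is made here.

```php
<?php
// Hypothetical sample markup, wrapped in <table><tr> (an assumption)
// so that the td fragment parses as a proper table.
$html = '<table><tr>'
      . '<td class="name"><a href="/blah/somename23123">Some Name</a></td>'
      . '<td class="name"><a href="/blah/somename28787">Some Name2</a></td>'
      . '</tr></table>';

$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

$manual = [];
$viaXpath = [];

foreach ($xpath->query("//td[@class='name']") as $td) {
    // Manual walk: look for the first a element among the children.
    foreach ($td->childNodes as $child) {
        if ($child instanceof DOMElement && $child->tagName === 'a') {
            $manual[] = [$child->textContent, $child->getAttribute('href')];
            break;
        }
    }

    // Context subquery: the same lookup expressed as two XPath calls.
    $viaXpath[] = [
        $xpath->evaluate("string(./a[1])", $td),
        $xpath->evaluate("string(./a[1]/@href)", $td),
    ];
}

// Both approaches should collect identical rows.
var_dump($manual === $viaXpath);
```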