Parse All Links That Contain A Specific Word In "href" Tag [duplicate]

Solution 1:

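Filter the links with a condition inside the loop: keep only those whose href starts with the string you are looking for.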

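<?php
// $urls is assumed to be the list of <a> elements,
// e.g. from $dom->getElementsByTagName('a').
$lookfor = '/link:';

foreach ($urls as $url) {
    $href = $url->getAttribute('href');
    // Keep only links whose href starts with $lookfor.
    if (substr($href, 0, strlen($lookfor)) === $lookfor) {
        echo "<br> " . $href . " , " . $url->getAttribute('title');
        echo "<hr><br>";
    }
}
?>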
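
The snippet above does not show where $urls comes from, so here is a self-contained sketch of the same loop-and-filter idea. It assumes the links are collected with DOMDocument::getElementsByTagName('a'); the helper name filterLinksByPrefix and the sample markup are made up for illustration.

```php
<?php
// Hypothetical helper: returns the href of every <a> whose href starts with $prefix.
function filterLinksByPrefix(string $html, string $prefix): array
{
    $dom = new DOMDocument();
    // Collect libxml warnings internally instead of printing them for imperfect HTML.
    libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();

    $matches = [];
    foreach ($dom->getElementsByTagName('a') as $a) {
        $href = $a->getAttribute('href');
        // strncmp returns 0 when the first strlen($prefix) bytes are equal.
        if (strncmp($href, $prefix, strlen($prefix)) === 0) {
            $matches[] = $href;
        }
    }
    return $matches;
}

// Made-up sample markup: only the first link starts with "/link:".
$html = '<p><a href="/link:one" title="One">1</a> '
      . '<a href="/other">2</a> '
      . '<a href="/x/link:two">3</a></p>';

print_r(filterLinksByPrefix($html, '/link:'));
```

To match the string anywhere in the href rather than only at the start, the condition could be swapped for strpos($href, $prefix) !== false.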

Solution 2:

Instead of first fetching all the a elements and then filtering out the ones you need, you can query the document for those nodes directly with XPath:

//a[contains(@href, "link:")]

This query will find all a elements in the document which contain the string link: in the href attribute.

To match only the a elements whose href attribute starts with link:, use

//a[starts-with(@href, "link:")]

Full example:

// $html is assumed to hold the markup to parse.
$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);
foreach ($xpath->query('//a[contains(@href, "link:")]') as $a) {
    echo $a->getAttribute('href'), PHP_EOL;
}
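
The difference between contains() and starts-with() is easiest to see by running both expressions against the same markup. The following is only a sketch: the helper name hrefsMatching and the sample HTML are assumptions made for illustration, not part of the original question.

```php
<?php
// Hypothetical helper: runs an XPath expression and returns the href of each match.
function hrefsMatching(string $html, string $expression): array
{
    $dom = new DOMDocument();
    // Collect libxml warnings internally instead of printing them for imperfect HTML.
    libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($dom);
    $hrefs = [];
    foreach ($xpath->query($expression) as $a) {
        $hrefs[] = $a->getAttribute('href');
    }
    return $hrefs;
}

// Made-up sample markup: "link:" appears at the start of one href and in the middle of another.
$html = '<p><a href="link:one">1</a> '
      . '<a href="/page?ref=link:two">2</a> '
      . '<a href="/other">3</a></p>';

$containing   = hrefsMatching($html, '//a[contains(@href, "link:")]');
$startingWith = hrefsMatching($html, '//a[starts-with(@href, "link:")]');

print_r($containing);   // both hrefs that contain "link:"
print_r($startingWith); // only the href that begins with "link:"
```

Note that both XPath functions compare case-sensitively, so "Link:" would not match either expression.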

Please also see

  • Implementing condition in XPath
  • excluding URLs from path links?
  • PHP/XPath: find text node that "starts with" a particular string?
  • PHP Xpath : get all href values that contain needle

for related questions.
