Strip Tags and everything in between

php

As you’re dealing with HTML, you should use an HTML parser to process it correctly. You can use PHP’s DOMDocument and query the elements with DOMXPath, e.g.:

$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);
foreach ($xpath->query('//h1') as $node) {
    $node->parentNode->removeChild($node);
}
$html = $doc->saveHTML();

Try this:

preg_replace('/<h1[^>]*>([\s\S]*?)<\/h1[^>]*>/', '', '<h1>including this content</h1>');

Example:

echo preg_replace('/<h1[^>]*>([\s\S]*?)<\/h1[^>]*>/', '', 'Hello<h1>including this content</h1> There !!');

Output:

Hello There

If you want to strip ALL tags and including content:

$yourString = 'Hello <div>Planet</div> Earth. This is some <span class="foo">sample</span> content!';
$regex = '/<[^>]*>[^<]*<[^>]*>/';
echo preg_replace($regex, '', $yourString);
#=> Hello  Earth. This is some  content!

HTML attributes can contain < or >. So, if your HTML gets too messy this method will not work and you'll need a DOM parser.

Regular Expression Explanation

NODE                     EXPLANATION
--------------------------------------------------------------------------------
  <                        '<'
--------------------------------------------------------------------------------
  [^>]*                    any character except: '>' (0 or more times
                           (matching the most amount possible))
--------------------------------------------------------------------------------
  >                        '>'
--------------------------------------------------------------------------------
  [^<]*                    any character except: '<' (0 or more times
                           (matching the most amount possible))
--------------------------------------------------------------------------------
  <                        '<'
--------------------------------------------------------------------------------
  [^>]*                    any character except: '>' (0 or more times
                           (matching the most amount possible))
--------------------------------------------------------------------------------
  >                        '>'

Strip Tags and everything in between

Regular Expression Explanation

Related

Recent Posts