Find everything between two XML tags with RegEx

Using a regex, I want to match an XML tag together with everything between its opening and closing tags, like the following:

<primaryAddress>
    <addressLine>280 Flinders Mall</addressLine>
    <geoCodeGranularity>PROPERTY</geoCodeGranularity>
    <latitude>-19.261365</latitude>
    <longitude>146.815585</longitude>
    <postcode>4810</postcode>
    <state>QLD</state>
    <suburb>Townsville</suburb>
    <type>PHYSICAL</type>
</primaryAddress>

I want to find the primaryAddress tag and everything inside it, and erase it.

The content inside the primaryAddress tag varies, but I want to remove the entire tag and its sub-tags whenever I encounter primaryAddress.

Does anyone have an idea how to do that?


Solution 1:

It is not a good idea to use regex for HTML/XML parsing...

However, if you want to do it anyway, search for the regex pattern

<primaryAddress>[\s\S]*?<\/primaryAddress>

and replace it with an empty string.
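
As a rough sketch of that search-and-replace in Python (the language choice, the sample xml_text, and the <customer> wrapper are assumptions made for illustration, not part of the question), re.sub can apply the pattern:

```python
import re

# Sample input; the <customer> wrapper and <name> element are invented for this sketch.
xml_text = """<customer>
    <name>Example</name>
    <primaryAddress>
        <addressLine>280 Flinders Mall</addressLine>
        <postcode>4810</postcode>
    </primaryAddress>
</customer>"""

# [\s\S] matches any character, newlines included; *? stops at the first closing tag.
PATTERN = re.compile(r"<primaryAddress>[\s\S]*?</primaryAddress>")


def remove_primary_address(text: str) -> str:
    """Replace every <primaryAddress>...</primaryAddress> block with an empty string."""
    return PATTERN.sub("", text)


cleaned = remove_primary_address(xml_text)
print(cleaned)
```

The indentation and line break around the removed element are left behind; if that matters, one option is to prefix the pattern with \s* so the leading whitespace is removed as well.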

Solution 2:

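You should be able to match it with: /<primaryAddress>(.+?)<\/primaryAddress>/s

The content between the tags will be in the first capture group. The s (dotall) flag is needed because . does not match newlines by default in most regex flavors, and the element in the question spans several lines. Where that flag is unavailable, use [\s\S] in place of the dot.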
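
To illustrate the capture group, here is a minimal Python sketch (Python and the shortened sample string are assumptions). In Python, . does not match a newline unless re.DOTALL is passed, so the sketch runs the same pattern with and without that flag:

```python
import re

# Shortened version of the element from the question.
xml_text = """<primaryAddress>
    <state>QLD</state>
    <suburb>Townsville</suburb>
</primaryAddress>"""

pattern = r"<primaryAddress>(.+?)</primaryAddress>"

# Without DOTALL, "." stops at newlines, so the multi-line element is not matched.
no_flag = re.search(pattern, xml_text)

# With DOTALL, "." also matches newlines and group 1 holds the inner content.
with_flag = re.search(pattern, xml_text, re.DOTALL)

print(no_flag)
print(with_flag.group(1).strip())
```

Here group(0) is the whole element including its tags, while group(1) is only what sits between them.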

Solution 3:

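Using regex for this is not a good idea, but if you really want to extract the content with regex:

<primaryAddress[^>]*>((.|\n)*?)<\/primaryAddress>

The accepted answer matches the tags as well, while this one captures only the value between the tags (in group 1). [^>]* is used for any attributes because a greedy .* can run past the end of the opening tag.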
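
The opening-tag part of such a pattern deserves some care. The Python sketch below compares a greedy .* in the opening tag with a bounded [^>]*. The type attribute and the single-line sample are hypothetical, chosen only to make the difference visible:

```python
import re

# Two elements on one line; the "type" attribute is invented for this sketch.
xml_text = (
    '<primaryAddress type="a"><state>QLD</state></primaryAddress>'
    '<primaryAddress type="b"><state>NSW</state></primaryAddress>'
)

greedy = re.compile(r"<primaryAddress.*>((.|\n)*?)</primaryAddress>")
bounded = re.compile(r"<primaryAddress[^>]*>([\s\S]*?)</primaryAddress>")

# Group 1 is the content between the opening and closing tags.
greedy_values = [m.group(1) for m in greedy.finditer(xml_text)]
bounded_values = [m.group(1) for m in bounded.finditer(xml_text)]

# The greedy .* runs past the first opening tag and swallows both elements as one match.
print(greedy_values)
# The bounded [^>]* stops at the end of each opening tag, giving one value per element.
print(bounded_values)
```

When every element starts on its own line, as in the question, the two patterns behave the same; the difference only shows up when several tags share a line.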

Solution 4:

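This can capture the outermost pair of tags in most cases, including tags with attributes or without end tags. It relies on recursion, (?R), so it needs a regex flavor that supports it, such as PCRE.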

(<!--((?!-->).)*-->|<\w*((?!\/<).)*\/>|<(?<tag>\w+)[^>]*>(?>[^<]|(?R))*<\/\k<tag>\s*>)

Edit: as mentioned in the comment above, regex is never enough to parse XML. Trying to modify the regex to fit more situations only makes it longer, and it is still useless.
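
For comparison, the warnings above point toward an XML parser. A minimal Python sketch using the standard library's xml.etree.ElementTree might look like this. The <customer> wrapper and the strip_elements helper are assumptions, since the question does not show the surrounding document:

```python
import xml.etree.ElementTree as ET

# Sample input; the <customer> wrapper and <name> element are invented for this sketch.
xml_text = """<customer>
    <name>Example</name>
    <primaryAddress>
        <state>QLD</state>
        <suburb>Townsville</suburb>
    </primaryAddress>
</customer>"""


def strip_elements(text: str, tag: str) -> str:
    """Parse the document, remove every element named `tag`, and serialize it again."""
    root = ET.fromstring(text)
    # Element.remove() only removes direct children, so visit every potential parent.
    # list(...) takes a snapshot so removals do not disturb the iteration.
    for parent in list(root.iter()):
        for child in list(parent):
            if child.tag == tag:
                parent.remove(child)
    return ET.tostring(root, encoding="unicode")


result = strip_elements(xml_text, "primaryAddress")
print(result)
```

Unlike the patterns above, this handles attributes, nesting, and elements at any depth without changes, though it does require the input to be well-formed XML and will not remove the root element itself.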