Regular expression for finding the 'href' value of an <a> link

I'd recommend using an HTML parser over a regex, but here's a regex that creates a capturing group around the value of the href attribute of each link. It matches whether double or single quotes are used: the first group captures the quote character and the second group captures the href value.

<a\s+(?:[^>]*?\s+)?href=(["'])(.*?)\1

A full explanation of this regex is available here.
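As a minimal sketch of how the capturing groups behave, the snippet below runs the pattern from this answer over a made-up sample string. The `g` flag is added here (an assumption, not part of the original pattern) so that `matchAll` can find every link; the helper name `extractHrefs` and the sample markup are purely illustrative.

```javascript
// The pattern from this answer, plus the g flag so matchAll can iterate over every link.
const linkRxGlobal = /<a\s+(?:[^>]*?\s+)?href=(["'])(.*?)\1/g;

// Hypothetical helper: returns the href value (capture group 2) of each match.
function extractHrefs(html) {
  return Array.from(html.matchAll(linkRxGlobal), (m) => m[2]);
}

// Made-up sample input mixing double and single quotes.
const sample = `<a href="https://example.com/a">A</a> <a class='x' href='/b'>B</a>`;
console.log(extractHrefs(sample)); // [ 'https://example.com/a', '/b' ]
```

Group 1 only records which quote character opened the value, so the backreference `\1` can require the same character to close it.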

Snippet playground:

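const linkRx = /<a\s+(?:[^>]*?\s+)?href=(["'])(.*?)\1/;
const textToMatchInput = document.querySelector('[name=textToMatch]');

document.querySelector('button').addEventListener('click', () => {
  // Index 2 of the match holds the href value; match() returns null when nothing matches
  console.log(textToMatchInput.value.match(linkRx));
});

<label>
  Text to match:
  <input type="text" name="textToMatch" value='<a href=""'>
</label>
<button>Match</button>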

Using regex to parse HTML is not recommended

Regular expressions are meant for regularly occurring patterns. HTML is not regular in its format (XHTML excepted). For example, an HTML file can be valid even when a closing tag is missing, and that could break your code.
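To make the fragility concrete, here is a small sketch that feeds a typical quote-based href pattern a few made-up inputs. Unquoted attribute values are valid HTML, yet the pattern misses them, and it still matches a link that sits inside a comment. The helper name `firstHref` and the sample strings are illustrative assumptions.

```javascript
// A quote-based href pattern (illustrative; similar regexes share the same limits).
const hrefRx = /<a\s+(?:[^>]*?\s+)?href=(["'])(.*?)\1/;

// Hypothetical helper: returns the captured href value, or null when nothing matches.
const firstHref = (html) => {
  const m = html.match(hrefRx);
  return m ? m[2] : null;
};

console.log(firstHref('<a href="/ok">ok</a>'));           // /ok
console.log(firstHref('<a href=/unquoted>valid</a>'));     // null (valid HTML, but missed)
console.log(firstHref('<!-- <a href="/dead">x</a> -->'));  // /dead (commented-out link still matches)
```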

Use an HTML parser like HtmlAgilityPack

You can use this code to retrieve the href values of all anchor tags using HtmlAgilityPack:

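HtmlDocument doc = new HtmlDocument();
doc.LoadHtml(html); // html is the markup string to parse; needs using HtmlAgilityPack and System.Linq

// SelectNodes returns null when no <a> element is found, hence the ?. operator
var hrefList = doc.DocumentNode.SelectNodes("//a")
                  ?.Select(p => p.GetAttributeValue("href", "not found"));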

hrefList contains the href value of every anchor, or "not found" for anchors without one.