Regex for an attribute value that contains the same quote character as its enclosing quotes

You can convert the " inside the attribute value to &quot;, after which it is easier to use a DOM parser to get the alt values:

$text = 'advcd<img loading="lazy" class="abcd pqr" alt="chi-phi-sinh-o-benh-v&quot;ien-dai-hoc-y-duoc-co-so-2" attr="val"><img loading="lazy" class="abcd pqr" alt="abcd-sinh-o-benh-&quot;ien-dai-hoc-y-duoc-co-so-3">sdfs';
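$dom = new DOMDocument();
// Parse the fragment without adding <html>/<body> wrappers or a doctype.
$dom->loadHTML($text, LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);
$xpath = new DOMXPath($dom);

// Select the alt attribute of every img element; nodeValue holds the decoded value.
foreach ($xpath->evaluate("//img/@alt") as $alt) {
    echo $alt->nodeValue . PHP_EOL;
}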

Output

chi-phi-sinh-o-benh-v"ien-dai-hoc-y-duoc-co-so-2
abcd-sinh-o-benh-"ien-dai-hoc-y-duoc-co-so-3
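
The conversion step itself is not shown above. One way it might be done is sketched below, assuming the closing quote of an alt value is always followed by the next attribute (name=") or by >. The helper name escapeInnerAltQuotes is hypothetical and not part of the original answer.

```php
<?php
// Hypothetical helper: replace quotes inside alt values with &quot;.
function escapeInnerAltQuotes(string $html): string
{
    // The closing quote is assumed to be a " that is followed by
    // the next attribute (name=") or by the end of the tag (>).
    $pattern = '/\b(alt)="(.*?)"(?=\s*(?:[^\s=]+="|>))/i';

    $result = preg_replace_callback(
        $pattern,
        static function (array $m): string {
            // $m[1] is the attribute name, $m[2] the raw value without outer quotes.
            return $m[1] . '="' . str_replace('"', '&quot;', $m[2]) . '"';
        },
        $html
    );

    // preg_replace_callback returns null on error; fall back to the input.
    return $result ?? $html;
}

$raw = 'advcd<img loading="lazy" class="abcd pqr" alt="chi-phi-sinh-o-benh-v"ien-dai-hoc-y-duoc-co-so-2" attr="val"><img loading="lazy" class="abcd pqr" alt="abcd-sinh-o-benh-"ien-dai-hoc-y-duoc-co-so-3">sdfs';

echo escapeInnerAltQuotes($raw), PHP_EOL;
```

The returned string can then be passed to DOMDocument::loadHTML as in the snippet above. The sketch only touches alt attributes; other attributes with the same problem would need the pattern widened.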

Using a regex for your example strings:

  • (alt)= Capture group 1, match alt followed by =
  • ( Capture group 2
    • ".*?" Match a ", then as few characters as possible up to the next "
    • (?= Positive lookahead, assert that what follows the closing " is
      • \s* Optional whitespace chars
      • (?:[^\s=]+="|>) Either one or more non-whitespace chars other than = followed by =" (the start of the next attribute), or >
    • ) Close lookahead
  • ) Close group 2

$text = 'advcd<img loading="lazy" class="abcd pqr" alt="chi-phi-sinh-o-benh-v"ien-dai-hoc-y-duoc-co-so-2" attr="val"><img loading="lazy" class="abcd pqr" alt="abcd-sinh-o-benh-"ien-dai-hoc-y-duoc-co-so-3">sdfs';

// preg_match_all returns the number of full matches (0 if none, false on error).
if (preg_match_all('/(alt)=(".*?"(?=\s*(?:[^\s=]+="|>)))/i', $text, $matches)) {
    print_r($matches);
}

Output

Array
(
    [0] => Array
        (
            [0] => alt="chi-phi-sinh-o-benh-v"ien-dai-hoc-y-duoc-co-so-2"
            [1] => alt="abcd-sinh-o-benh-"ien-dai-hoc-y-duoc-co-so-3"
        )

    [1] => Array
        (
            [0] => alt
            [1] => alt
        )

    [2] => Array
        (
            [0] => "chi-phi-sinh-o-benh-v"ien-dai-hoc-y-duoc-co-so-2"
            [1] => "abcd-sinh-o-benh-"ien-dai-hoc-y-duoc-co-so-3"
        )

)
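
To see what the lookahead contributes, the following sketch compares the pattern with and without it on the first example tag. The variable names $naive and $guarded are only illustrative.

```php
<?php
$text = 'advcd<img loading="lazy" class="abcd pqr" alt="chi-phi-sinh-o-benh-v"ien-dai-hoc-y-duoc-co-so-2" attr="val">sdfs';

// Without the lookahead, the lazy quantifier stops at the first inner quote.
preg_match('/alt=(".*?")/i', $text, $naive);

// With the lookahead, a quote only counts as the closing one when it is
// followed by the next attribute (name=") or by >.
preg_match('/alt=(".*?"(?=\s*(?:[^\s=]+="|>)))/i', $text, $guarded);

echo $naive[1], PHP_EOL;   // "chi-phi-sinh-o-benh-v"
echo $guarded[1], PHP_EOL; // "chi-phi-sinh-o-benh-v"ien-dai-hoc-y-duoc-co-so-2"
```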

It seems the markup is malformed, and a \ should be added before each inner ". Nevertheless, the following regex leads to a solution:

(alt)=((["\']).*?[^\\]\3)(?:\s|>)

\3: a backreference to capture group 3. It is used because the value must end with the same quote character (" or ') that it started with.

[^\\]\3: the closing quote must not be preceded by a \, so a backslash-escaped quote does not end the value.

(?:\s|>): the closing " or ' must be followed by a whitespace character or >.

https://www.phpliveregex.com/p/DmU
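
A minimal PHP sketch of how this pattern could be applied. The input string is made up for illustration: the first tag has an unescaped inner quote, the second has a backslash-escaped quote followed by a space. Nowdoc syntax is used so that the backslashes need no additional PHP escaping.

```php
<?php
// The pattern from above, wrapped in / delimiters.
$pattern = <<<'REGEX'
/(alt)=((["']).*?[^\\]\3)(?:\s|>)/
REGEX;

// Illustrative input, not taken from the original question.
$html = <<<'HTML'
<img alt="benh-v"ien-co-so-2" attr="val"><img alt='say \' hi'>
HTML;

// Group 2 holds each value including its enclosing quotes.
$count = preg_match_all($pattern, $html, $matches);

echo $count, PHP_EOL;
print_r($matches[2]);
```

In the second tag, the \' is skipped as a closing quote because [^\\] refuses a preceding backslash. Since [^\\] always consumes one character, an empty value such as alt="" is not matched correctly by this pattern.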