Regex: How to find all HTML lines with English words in tags whose content is written in another language
I have these HTML tags:
<p class="BEBE">着名的文学评论家Love有一些重要的东西来说,关于总是分享胜利的人才,转向他们的起源:</p>
<p class="BEBE">着名的文学评论家 有一些重要的东西来说,关于总是分享胜利的人才,kiss 转向他们的起源:</p>
I need to find all lines that contain at least one English word inside tags whose content is written in another language (Chinese, for example).
But I don't want to match this one, because it has no English words:
<p class="BEBE">某些,真正的经济学,真正预测的是神圣的本质</p>
My regex doesn't work; it seems to match all the tags:
FIND: <p class="BEBE">.*[^\x00-\x7F]+.*</p>
And this regex finds only those HTML tags that contain only Chinese words, with no English:
FIND: <p class="BEBE">+(?!\w+[\x00-\x7F]).*</p>
But I need only those tags that contain at least one English word.
You have extra spaces in your regex:
<p class="BEBE">.* [^\x00-\x7F]+ .*</p>
# here ___________^ and ________^
Remove them:
<p class="BEBE">.*[^\x00-\x7F]+.*</p>
The solution, thanks to @Toto:
FIND: <p class="BEBE">+(\w+[\x00-\x7F]).*</p>
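One caveat worth checking before relying on this: in Unicode-aware engines \w also matches Chinese characters, and [\x00-\x7F] also matches ASCII punctuation such as a comma, so the pattern above may still match a paragraph that has no English letters at all. Here is a minimal Python sketch of a stricter variant. It assumes that an "English word" means a run of ASCII letters, that "another language" means at least one non-ASCII character, and that the paragraph holds plain text with no inner tags. The names PATTERN, lines and matches are mine, not from the post.

```python
import re

# Assumptions (not from the original post):
#   - "English word"      = one or more ASCII letters [A-Za-z]
#   - "another language"  = at least one non-ASCII character
#   - the paragraph content contains no inner tags, hence [^<]*
PATTERN = re.compile(
    r'<p class="BEBE">'
    r'(?=[^<]*[^\x00-\x7F])'   # lookahead: content has a non-ASCII character
    r'[^<]*[A-Za-z]+[^<]*'     # content has at least one ASCII letter
    r'</p>'
)

lines = [
    '<p class="BEBE">着名的文学评论家Love有一些重要的东西来说,关于总是分享胜利的人才,转向他们的起源:</p>',
    '<p class="BEBE">着名的文学评论家 有一些重要的东西来说,关于总是分享胜利的人才,kiss 转向他们的起源:</p>',
    '<p class="BEBE">某些,真正的经济学,真正预测的是神圣的本质</p>',
]

# Keep only the lines whose paragraph mixes non-ASCII text with ASCII letters.
matches = [line for line in lines if PATTERN.search(line)]

for line in matches:
    print(line)  # prints the first two sample lines only
```

The same pattern string should also work in a FIND box of an editor that supports lookaheads, but I have only checked it against Python's re module.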
Also, if you want to skip the tags that contain <em> or </em>:
FIND: <p class="BEBE">+(?!\w+</em>)+\w+(\w+[\x00-\x7F]).*</p>
or
FIND: <p class="BEBE">+(?!\w+<em>).*(\w+[\x00-\x7F]).*</p>
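For comparison, here is a hedged Python sketch of the "skip paragraphs with <em>" idea, written with a negative lookahead placed right after the opening tag. It assumes one paragraph per line (as in the samples), that an English word is a run of ASCII letters, and that the foreign text is any non-ASCII character. The names SKIP_EM and samples are hypothetical, and this is not a drop-in replacement for the expressions above.

```python
import re

# Assumptions (not from the original post):
#   - one <p class="BEBE">...</p> per line, so "." never leaves the paragraph
#   - "English word" = ASCII letters, "another language" = non-ASCII characters
SKIP_EM = re.compile(
    r'<p class="BEBE">'
    r'(?!.*</?em>)'            # reject the line if <em> or </em> follows
    r'(?=.*[^\x00-\x7F])'      # require at least one non-ASCII character
    r'.*[A-Za-z].*</p>'        # require at least one ASCII letter before </p>
)

samples = [
    '<p class="BEBE">着名的文学评论家Love有一些</p>',           # mixed text, no <em>
    '<p class="BEBE">着名的<em>Love</em>有一些</p>',          # contains <em>, skipped
    '<p class="BEBE">某些,真正的经济学</p>',                   # no English letters
]

kept = [s for s in samples if SKIP_EM.search(s)]
print(len(kept))  # 1
```

The lookahead runs once, immediately after the opening tag, so it sees the whole rest of the line; that is why the one-paragraph-per-line assumption matters.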