Regex: Select everything up to | but not | (between <title></title>)

I have this example:

<title>Square Meters | Dragon White (en)</title>

I want to use a regex to select everything up to, but not including, the | (between <title> and </title>).

Both of my regexes also select the |, so I need a better pattern that leaves the | out.

SEARCH: \w+.*\| or \w+.*?[\s\S]\|

This is the line in my Python code whose regex I need to adjust:

words = re.findall(r'\w+', new_filename)

Right now the result is square-meters-dragon-white-en.html

But the expected result should be: square-meters.html

This is the relevant part of the Python code:

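import re

# title is the <title> element parsed earlier (not shown here)
new_filename = title.get_text()
new_filename = new_filename.lower()
# split the text into words, then join them with dashes
words = re.findall(r'\w+', new_filename)
new_filename = '-'.join(words)
new_filename = new_filename + '.html'
print(new_filename)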

I get very close if I change the regex to: (?=\w+).*(?= \|)

words = re.findall(r'(?=\w+).*(?= \|)', new_filename)

and I get: square meters.html (but without the dash between the words)


Simply use [^|]+, which matches one or more characters that are not a pipe. Note that this also matches line breaks.

If you don't want to match line breaks, use [^|\r\n]+ instead.
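As a small illustration of the difference between the two character classes, here is a sketch in Python. The two-line input string is made up for the demonstration and is not taken from the question.

```python
import re

# Hypothetical two-line input, used only to show the difference.
text = "Square Meters\ncontinued | Dragon White (en)"

# [^|]+ stops only at a pipe, so the match runs across the line break.
across_lines = re.match(r'[^|]+', text).group()

# [^|\r\n]+ also stops at a line break, so the match ends with the first line.
single_line = re.match(r'[^|\r\n]+', text).group()

print(repr(across_lines))
print(repr(single_line))
```

Note that the first match keeps the space that precedes the pipe, because a space is also "not a pipe".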

This works in any text editor that supports regular expressions.
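For the Python code in the question, one possible way to apply this pattern is sketched below. The function name title_to_filename is hypothetical, and re.match is used instead of re.findall because findall would return the segments on both sides of the pipe. The remaining steps (lowercase, \w+, join with dashes) are the ones from the question.

```python
import re

def title_to_filename(title_text):
    """Build a dash-separated .html filename from the text before the first pipe."""
    # Keep only the text before the first pipe (or the whole line if there is none).
    match = re.match(r'[^|\r\n]+', title_text)
    head = match.group() if match else ''
    # Same steps as in the question: lowercase, extract words, join with dashes.
    words = re.findall(r'\w+', head.lower())
    return '-'.join(words) + '.html'

print(title_to_filename('Square Meters | Dragon White (en)'))
```

In the original script this would replace the lines between title.get_text() and print(new_filename), with title.get_text() passed in as title_text.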