Regex match character only when NOT preceded by specific word

The goal is to have a regex match every newline character that is not preceded by a 2-decimal number. Here's some example text:

This line ends with text
this line ends with a number: 55
this line ends with a 2-decimal number: 5.00
here's 22.22, not at the end of the line

The regex should match the newlines at the end of lines 1, 2, and 4 (assuming a newline after the 4th line). I thought negative lookahead was the answer, so I tried

(?!\d*\.\d\d)\n

without success as seen in this regex101 snippet: https://regex101.com/r/qbrKlt/4

Edit: I later read that Python's re module only supports fixed-width patterns in lookbehind assertions (lookahead has no such restriction), so I tried a fixed-length pattern as well.

Unfortunately, fixed-length lookahead still didn't work:

(?!\.\d\d)\n
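A quick check suggests why: a lookahead inspects the characters starting at the current position, which here is the newline itself, so it never sees the digits before it. The sketch below uses the example text from above; the variable names are only illustrative.

```python
import re

# Example text from the question, with a newline after the 4th line.
text = (
    "This line ends with text\n"
    "this line ends with a number: 55\n"
    "this line ends with a 2-decimal number: 5.00\n"
    "here's 22.22, not at the end of the line\n"
)

# The negative lookahead is tested at the position of "\n" and looks forward,
# so it can never find ".dd" there and always succeeds.
lookahead_hits = [m.start() for m in re.finditer(r"(?!\.\d\d)\n", text)]
every_newline = [m.start() for m in re.finditer(r"\n", text)]

print(lookahead_hits == every_newline)  # True: every newline is matched
```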

Instead, I worked around it by running two regexes and subtracting the results:

  1. find all indices of newline characters: \n
  2. find all indices of newline characters preceded by 2-decimal numbers: \d*\.\d\d\n
  3. remove indices found in step 2 from those found in step 1 for the answer
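The three steps above can be sketched roughly as follows. This is a minimal illustration, not the exact code I ran; the names all_newlines, after_decimal, and wanted are made up for the example.

```python
import re

text = (
    "This line ends with text\n"
    "this line ends with a number: 55\n"
    "this line ends with a 2-decimal number: 5.00\n"
    "here's 22.22, not at the end of the line\n"
)

# Step 1: indices of every newline character.
all_newlines = {m.start() for m in re.finditer(r"\n", text)}

# Step 2: indices of newlines preceded by a 2-decimal number.
# The match ends with "\n", so m.end() - 1 is the index of the newline itself.
after_decimal = {m.end() - 1 for m in re.finditer(r"\d*\.\d\d\n", text)}

# Step 3: remove the step 2 indices from the step 1 indices.
wanted = sorted(all_newlines - after_decimal)

# Convert indices to 1-based line numbers for readability.
lines_matched = [text.count("\n", 0, i) + 1 for i in wanted]
print(lines_matched)  # [1, 2, 4]
```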

But I'm sure there's a way to do this in 1 go and I'd be grateful to anyone out there that can help in discovering the solution :)


Solution 1:

You need to use a negative lookbehind instead of a negative lookahead:

(?<!\.\d\d)\n

This matches \n only when it is not immediately preceded by a dot and two digits. The lookbehind is fixed-width (three characters), which is what Python's re module requires.
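As a rough sketch of how this could be used from Python, assuming the example text from the question (the variable names are illustrative only):

```python
import re

text = (
    "This line ends with text\n"
    "this line ends with a number: 55\n"
    "this line ends with a 2-decimal number: 5.00\n"
    "here's 22.22, not at the end of the line\n"
)

# Negative lookbehind: the three characters before "\n" must not be ".dd".
pattern = re.compile(r"(?<!\.\d\d)\n")

positions = [m.start() for m in pattern.finditer(text)]

# Convert each newline index to the 1-based number of the line it ends.
line_numbers = [text.count("\n", 0, p) + 1 for p in positions]
print(line_numbers)  # [1, 2, 4]
```

Only the newline after "5.00" is skipped; the "22.22" in the last line does not matter because it is not directly before the newline.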

Solution 2:

Why get esoteric with regexes, when you can just capture the final word using string.split()[-1] and test that for the form you need? Python isn't Perl (fortunately).
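One possible reading of this, as a sketch: split each line, take the last word, and test it. The format test here still uses a small regex via fullmatch, which is my assumption; any other check for the "digits, dot, two digits" form would do.

```python
import re

text = (
    "This line ends with text\n"
    "this line ends with a number: 55\n"
    "this line ends with a 2-decimal number: 5.00\n"
    "here's 22.22, not at the end of the line\n"
)

# Assumed definition of a "2-decimal number": optional digits, a dot, two digits.
two_decimal = re.compile(r"\d*\.\d\d")

line_numbers = []
for number, line in enumerate(text.splitlines(), start=1):
    words = line.split()
    last_word = words[-1] if words else ""  # guard against empty lines
    if not two_decimal.fullmatch(last_word):
        line_numbers.append(number)

print(line_numbers)  # [1, 2, 4]
```

Note that this yields line numbers rather than newline positions, and it treats a last word such as "abc5.00" as not a number, unlike the lookbehind in Solution 1.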