Negative lookbehind equivalent in JavaScript

Is there a way to achieve the equivalent of a negative lookbehind in JavaScript regular expressions? I need to match a string that does not start with a specific set of characters.

I can't find a regex that does this without failing when the matched part is at the beginning of the string. A negative lookbehind seems to be the only answer, but JavaScript doesn't have one.

This is the regex that I would like to work, but it doesn't:

(?<!([abcdefg]))m

So it would match the 'm' in 'jim' or 'm', but not 'jam'


Solution 1:

Since 2018, lookbehind assertions have been part of the ECMAScript language specification.

// positive lookbehind
(?<=...)
// negative lookbehind
(?<!...)
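Applied to the question's example, a minimal sketch (it assumes an engine that supports lookbehind; the variable names are only illustrative, and the capturing group from the question is dropped because it is not needed):

```javascript
// Match an 'm' that is not immediately preceded by one of a-g.
const lookbehindRe = /(?<![abcdefg])m/;

const lookbehindResults = ['jim', 'm', 'jam'].map(s => lookbehindRe.test(s));
console.log(lookbehindResults); // [ true, true, false ]
```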

Answer pre-2018

Since JavaScript supports negative lookahead, one way to do it is to:

  1. reverse the input string

  2. match with a reversed regex

  3. reverse and reformat the matches


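// Reverse a string (splits on UTF-16 code units, so only safe for simple text)
const reverse = s => s.split('').reverse().join('');

// Run a regex written for reversed input against each reversed string
const test = (stringsToTest, reversedRegexp) => stringsToTest
  .map(reverse)
  .forEach((s, i) => {
    const match = reversedRegexp.exec(s);
    // Reverse the matched token back to its original orientation before printing
    console.log(stringsToTest[i], match !== null, 'token:', match ? reverse(match[0]) : 'Ø');
  });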

Example 1:

Following @andrew-ensley's question:

test(['jim', 'm', 'jam'], /m(?!([abcdefg]))/)

Outputs:

jim true token: m
m true token: m
jam false token: Ø

Example 2:

Following @neaumusic's comment (match max-height but not line-height, the token being height):

test(['max-height', 'line-height'], /thgieh(?!(-enil))/)

Outputs:

max-height true token: height
line-height false token: Ø

Solution 2:

Lookbehind Assertions got accepted into the ECMAScript specification in 2018.

Positive lookbehind usage:

console.log(
  "$9.99  €8.47".match(/(?<=\$)\d+\.\d*/) // Matches "9.99"
);

Negative lookbehind usage:

console.log(
  "$9.99  €8.47".match(/(?<!\$)\d+\.\d*/) // Matches "8.47"
);

Platform support:

  • ✔️ V8
    • ✔️ Google Chrome 62.0
    • ✔️ Microsoft Edge 79.0
    • ✔️ Node.js 6.0 behind a flag and 9.0 without a flag
    • ✔️ Deno (all versions)
  • ✔️ SpiderMonkey
    • ✔️ Mozilla Firefox 78.0
  • ✔️ JavaScriptCore
    • ✔️ Apple Safari 16.4
    • ✔️ iOS WebView 16.4 (all browsers on iOS + iPadOS)
  • ❌ Chakra: Microsoft was working on it but Chakra is now abandoned in favor of V8
    • ❌ Internet Explorer
    • ❌ Edge versions prior to 79 (the ones based on EdgeHTML+Chakra)
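
If some of your target engines may lack support, one common approach is to feature-detect at runtime. A minimal sketch, assuming the pattern is built with the RegExp constructor so that an unsupported engine throws a catchable SyntaxError instead of failing to parse the whole script (the function name is hypothetical):

```javascript
// Returns true when the engine can compile a lookbehind pattern.
function supportsLookbehind() {
  try {
    // A regex literal would be a parse-time error on old engines,
    // so the pattern is passed as a string instead.
    new RegExp('(?<!a)b');
    return true;
  } catch (e) {
    return false; // SyntaxError on engines without lookbehind
  }
}

const hasLookbehind = supportsLookbehind();
console.log('lookbehind supported:', hasLookbehind);
```

When the check returns false, the code can fall back to one of the lookahead-based workarounds described in the other answers.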

Solution 3:

Let's suppose you want to find all int not preceded by unsigned:

With support for negative look-behind:

(?<!unsigned )int

Without support for negative look-behind:

((?!unsigned ).{9}|^.{0,8})int

Basically, the idea is to grab the n preceding characters and exclude the match with a negative lookahead, while also matching the cases where there are fewer than n preceding characters (where n is the length of the lookbehind).
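
A quick sketch comparing the two patterns on a few sample inputs (the sample strings and variable names are made up for illustration; the second regex needs an engine with lookbehind support):

```javascript
// Workaround: 9 is the length of "unsigned " (including the trailing space).
const intWorkaround = /((?!unsigned ).{9}|^.{0,8})int/;
// Native negative lookbehind, for comparison.
const intLookbehind = /(?<!unsigned )int/;

const intSamples = ['unsigned int x', 'long int x', 'int x', 'static long int x'];

const workaroundResults = intSamples.map(s => intWorkaround.test(s));
const nativeResults = intSamples.map(s => intLookbehind.test(s));

console.log(workaroundResults); // [ false, true, true, true ]
console.log(nativeResults);     // [ false, true, true, true ]
```

Note that `.` does not match line terminators unless the `s` flag is set, so the workaround can behave differently from a real lookbehind on multi-line input.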

So the regex in question:

(?<!([abcdefg]))m

would translate to:

((?!([abcdefg])).|^)m

You might need to play with capturing groups to find the exact spot in the string that interests you, or to replace a specific part with something else.
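
For instance, because the workaround also consumes the character before the 'm', a replacement has to put that character back via the first capturing group. A small sketch (the replacement text 'M' and the variable names are only illustrative):

```javascript
const notAfterAtoG = /((?![abcdefg]).|^)m/;

// $1 restores the consumed preceding character (empty at the start of the string).
const replaced = ['jim', 'm', 'jam'].map(s => s.replace(notAfterAtoG, '$1M'));
console.log(replaced); // [ 'jiM', 'M', 'jam' ]

// The real position of the 'm' is the match index plus the length of group 1.
const hit = notAfterAtoG.exec('jim');
const mIndex = hit.index + hit[1].length;
console.log(mIndex); // 2
```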

Solution 4:

Mijoja's strategy works for your specific case but not in general:

js>newString = "Fall ball bill balll llama".replace(/(ba)?ll/g,
   function($0,$1){ return $1?$0:"[match]";});
Fa[match] ball bi[match] balll [match]ama

Here's an example where the goal is to match a double-l but not if it is preceded by "ba". Note the word "balll" -- true lookbehind should have suppressed the first 2 l's but matched the 2nd pair. But by matching the first 2 l's and then ignoring that match as a false positive, the regexp engine proceeds from the end of that match, and ignores any characters within the false positive.
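
To make the difference concrete, here is a sketch that runs both approaches on the same input (it assumes an engine with native lookbehind; the variable names are illustrative):

```javascript
const fallInput = 'Fall ball bill balll llama';

// True negative lookbehind: the second pair of l's in "balll" is matched.
const withLookbehind = fallInput.replace(/(?<!ba)ll/g, '[match]');
console.log(withLookbehind); // Fa[match] ball bi[match] bal[match] [match]ama

// Optional-group strategy: "ball" inside "balll" is consumed and skipped,
// so the overlapping pair is never examined.
const withCallback = fallInput.replace(/(ba)?ll/g, ($0, $1) => ($1 ? $0 : '[match]'));
console.log(withCallback); // Fa[match] ball bi[match] balll [match]ama
```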

Solution 5:

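Use an optional capturing group and a replacement function that returns the match unchanged when the group is present: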

newString = string.replace(/([abcdefg])?m/, function($0,$1){ return $1?$0:'m';});