How to capture text between backslashes and " character with RegEx (Java)? [duplicate]

Solution 1:

You need to make your regular expression lazy (non-greedy). By default "(.*)" is greedy, so the group runs to the last quote in the input and captures all of file path/level1/level2" xxx some="xxx.

Instead you can make your dot-star non-greedy, which will make it match as few characters as possible:

/location="(.*?)"/

Adding a ? after a quantifier (?, * or +) makes it non-greedy.
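
To see the difference in Java, here is a minimal sketch. The sample input is an assumption pieced together from the text quoted above, and the class and helper names (LazyQuantifierDemo, firstGroup) are made up for illustration. Note that the surrounding / / in the pattern above are just delimiters and are not part of a Java pattern string.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LazyQuantifierDemo {
    // Hypothetical helper: returns capture group 1 of the first match, or null if nothing matches.
    static String firstGroup(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // Assumed sample input, reconstructed from the answer text.
        String input = "location=\"file path/level1/level2\" xxx some=\"xxx\"";

        // Greedy: .* grabs as much as possible, so the group ends at the LAST quote.
        System.out.println(firstGroup("location=\"(.*)\"", input));
        // prints: file path/level1/level2" xxx some="xxx

        // Lazy: .*? grabs as little as possible, so the group ends at the FIRST closing quote.
        System.out.println(firstGroup("location=\"(.*?)\"", input));
        // prints: file path/level1/level2
    }
}
```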

Note: this is only available in regex engines that implement the Perl 5 extensions (Java, JavaScript, Ruby, Python, etc.), not in traditional POSIX-style engines (awk, sed, grep without -P, etc.).

Solution 2:

location="(.*)" will match from the " after location= all the way to the last " in the input (the one after some="xxx) unless you make it non-greedy.

So you either need .*? (i.e. make it non-greedy by adding ?) or, better, replace .* with [^"]*.

  • [^"] matches any single character except a " (quotation mark)
  • More generally: [^abc] matches any single character except a, b or c
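
A rough Java sketch of the negated-class version, in case it helps. The sample input is assumed from the question context, and NegatedClassDemo / extractLocation are hypothetical names, not anything from the original post.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NegatedClassDemo {
    // [^"]* matches a run of characters that are not a double quote,
    // so the group cannot extend past the closing quote of the attribute.
    private static final Pattern LOCATION = Pattern.compile("location=\"([^\"]*)\"");

    // Hypothetical helper: returns the location value, or null if the attribute is absent.
    static String extractLocation(String input) {
        Matcher m = LOCATION.matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // Assumed sample input.
        String input = "location=\"file path/level1/level2\" xxx some=\"xxx\"";
        System.out.println(extractLocation(input));
        // prints: file path/level1/level2

        // The generic form: [^abc] matches anything except a, b or c,
        // so removing every such match keeps only a, b and c.
        System.out.println("cabbage".replaceAll("[^abc]", ""));
        // prints: cabba
    }
}
```

Unlike .*?, the negated class needs no backtracking to find the closing quote and also works in engines without lazy quantifiers.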

Solution 3:

How about

.*location="([^"]*)".*

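Because [^"]* cannot cross a quote, the capture stops exactly at the first closing quote instead of running on the way a greedy .* inside the group would. The leading and trailing .* are only needed when the pattern has to match the entire input (e.g. with matches() in Java).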
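
For what it's worth, here is a small Java sketch of where the surrounding .* matters. The input string and the names AnchoredMatchDemo, viaMatches and viaFind are assumptions for illustration. Also keep in mind that . does not match line terminators by default, so this assumes single-line input.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AnchoredMatchDemo {
    // matches() must consume the whole input, hence the .* on both sides.
    private static final Pattern WHOLE = Pattern.compile(".*location=\"([^\"]*)\".*");
    // find() scans for the pattern anywhere in the input, so no padding is needed.
    private static final Pattern PART = Pattern.compile("location=\"([^\"]*)\"");

    static String viaMatches(String input) {
        Matcher m = WHOLE.matcher(input);
        return m.matches() ? m.group(1) : null;
    }

    static String viaFind(String input) {
        Matcher m = PART.matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // Assumed sample input with text before and after the attribute.
        String input = "<file location=\"file path/level1/level2\" xxx some=\"xxx\"/>";

        System.out.println(viaMatches(input)); // prints: file path/level1/level2
        System.out.println(viaFind(input));    // prints: file path/level1/level2

        // Without the surrounding .*, matches() rejects input that has extra text around the attribute.
        System.out.println(PART.matcher(input).matches()); // prints: false
    }
}
```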

Solution 4:

Use non-greedy matching, if your engine supports it. Add the ? after the * inside the capture group:

/location="(.*?)"/