How to make a regular expression non-greedy?
I'm using jQuery. I have a string containing blocks delimited by special begin and end characters, and I want to extract the text of each block. I used a regular expression to search the string, but how can I get multiple results when the string contains two or more such blocks?
My HTML:
<div id="container">
<div id="textcontainer">
Cuộc chiến pháp lý giữa [|cơ thử|nghiệm|] thị trường [|test2|đây là test lần 2|] chứng khoán [|Mỹ|day la nuoc my|] và ngân hàng đầu tư quyền lực nhất Phố Wall mới chỉ bắt đầu.
</div>
</div>
and my JavaScript code:
$(document).ready(function() {
    var takedata = $("#textcontainer").text();
    var test = 'abcd adddb';
    var filterdata = takedata.match(/(\[.+\])/);
    alert(filterdata);
    //end write js
});
My result is: [|cơ thử|nghiệm|] thị trường [|test2|đây là test lần 2|] chứng khoán [|Mỹ|day la nuoc my|]. But this isn't the result I want :( How can I get [text] as the first match and [demo] as the second?
I got it working after searching the internet ^^. My code now looks like this:
var filterdata = takedata.match(/(\[.*?\])/g);
My result is: [|cơ thử|nghiệm|],[|test2|đây là test lần 2|], which is right! But I don't really understand why it works. Can you explain?
The non-greedy regex quantifiers are like their greedy counterparts but with a ? immediately following them:
* - zero or more
*? - zero or more (non-greedy)
+ - one or more
+? - one or more (non-greedy)
? - zero or one
?? - zero or one (non-greedy)
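A minimal JavaScript sketch of the three pairs listed above. The sample strings and variable names are made up for illustration and do not come from the question.

```javascript
// Illustrative sample, not taken from the question.
const text = "<b>one</b><b>two</b>";

// * vs *? : greedy runs to the last ">", lazy stops at the first one.
const greedyStar = text.match(/<.*>/)[0];
const lazyStar = text.match(/<.*?>/)[0];

// + vs +? : greedy takes every digit, lazy takes the minimum of one.
const greedyPlus = "12345".match(/\d+/)[0];
const lazyPlus = "12345".match(/\d+?/)[0];

// ? vs ?? : greedy takes the optional "u" when present, lazy prefers to skip it.
const greedyOpt = "colour".match(/colou?/)[0];
const lazyOpt = "colour".match(/colou??/)[0];

console.log(greedyStar); // <b>one</b><b>two</b>
console.log(lazyStar);   // <b>
console.log(greedyPlus); // 12345
console.log(lazyPlus);   // 1
console.log(greedyOpt);  // colou
console.log(lazyOpt);    // colo
```

In each pair the lazy form tries the shortest repetition first and only grows when the rest of the pattern fails to match.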
You are right that greediness is an issue:
--A--Z--A--Z--
  ^^^^^^^^^^
     A.*Z
If you want to match both A--Z, you'd have to use A.*?Z (the ? makes the * "reluctant", or lazy).
There are sometimes better ways to do this, though, e.g.
A[^Z]*+Z
This uses a negated character class and a possessive quantifier to reduce backtracking, and is likely to be more efficient.
In your case, the regex would be:
/(\[[^\]]++\])/
Unfortunately, JavaScript regex doesn't support possessive quantifiers, so you'd just have to make do with:
/(\[[^\]]+\])/
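As a rough check of both points, the sketch below first tries to compile the possessive pattern and then applies the negated character class. The sample string is a shortened, made-up stand-in for the question's text, and the g flag is an addition assumed here so that every block is returned rather than only the first.

```javascript
// Shortened, made-up sample in the same [|...|...|] format as the question.
const sample = "alpha [|a|b|] beta [|c|d|] gamma";

// The possessive form "++" is rejected as a syntax error by JavaScript engines.
let possessiveSupported = true;
try {
  new RegExp("\\[[^\\]]++\\]");
} catch (e) {
  possessiveSupported = !(e instanceof SyntaxError);
}

// Negated character class: [^\]]+ can never run past a "]",
// so each match ends at the first closing bracket.
// The g flag (an assumption added here) collects every block.
const blocks = sample.match(/\[[^\]]+\]/g);

console.log(possessiveSupported); // false
console.log(blocks);              // [ '[|a|b|]', '[|c|d|]' ]
```

Because the character class itself excludes the closing bracket, the quantifier can stay greedy and still stop at the right place.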
See also
- regular-expressions.info/Repetition
  - See: An Alternative to Laziness
- Possessive quantifiers
- Flavors comparison
Quick summary
* Zero or more, greedy
*? Zero or more, reluctant
*+ Zero or more, possessive
+ One or more, greedy
+? One or more, reluctant
++ One or more, possessive
? Zero or one, greedy
?? Zero or one, reluctant
?+ Zero or one, possessive
Note that the reluctant and possessive quantifiers are also applicable to the finite repetition {n,m} constructs.
Examples in Java:
System.out.println("aAoZbAoZc".replaceAll("A.*Z", "!")); // prints "a!c"
System.out.println("aAoZbAoZc".replaceAll("A.*?Z", "!")); // prints "a!b!c"
System.out.println("xxxxxx".replaceAll("x{3,5}", "Y")); // prints "Yx"
System.out.println("xxxxxx".replaceAll("x{3,5}?", "Y")); // prints "YY"
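Since the question is about JavaScript, here is a sketch of the same four cases in that language. It assumes that String.prototype.replace with a g-flagged regex is an acceptable stand-in for Java's replaceAll; the variable names are illustrative.

```javascript
// replace() with the g flag replaces every match, mirroring Java's replaceAll.
const greedyAZ = "aAoZbAoZc".replace(/A.*Z/g, "!");
const lazyAZ = "aAoZbAoZc".replace(/A.*?Z/g, "!");

// {n,m} is greedy by default; {n,m}? takes as few repetitions as allowed.
const greedyRange = "xxxxxx".replace(/x{3,5}/g, "Y");
const lazyRange = "xxxxxx".replace(/x{3,5}?/g, "Y");

console.log(greedyAZ);    // a!c
console.log(lazyAZ);      // a!b!c
console.log(greedyRange); // Yx
console.log(lazyRange);   // YY
```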
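I believe it would be like this
takedata.match(/(\[.+?\])/g);
the g at the end means global, so it doesn't stop at the first match. Note that g alone is not enough: the ? after the + is also needed, because a greedy .+ would still run from the first [ to the last ] and return a single match.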
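A small sketch of how the g flag and the lazy quantifier interact, using a made-up sample string in the question's format. The g flag controls how many matches are returned, while the ? after the quantifier controls where each match ends.

```javascript
// Made-up sample in the same [|...|...|] format as the question.
const sample = "x [|a|b|] y [|c|d|] z";

// Greedy with g: .+ runs to the last "]", so there is only one long match.
const greedyGlobal = sample.match(/(\[.+\])/g);

// Lazy with g: each match stops at the nearest "]".
const lazyGlobal = sample.match(/(\[.+?\])/g);

// Lazy without g: only the first match, followed by its capture group.
const lazyFirst = sample.match(/(\[.+?\])/);

console.log(greedyGlobal); // [ '[|a|b|] y [|c|d|]' ]
console.log(lazyGlobal);   // [ '[|a|b|]', '[|c|d|]' ]
console.log(lazyFirst[0]); // [|a|b|]
console.log(lazyFirst[1]); // [|a|b|]
```

With g, match() returns only the full matches; without it, the result holds the first match plus its captured groups, which is why the capturing parentheses are optional when g is used.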