My regular expression matches too much. How can I tell it to match the smallest possible pattern? [duplicate]

I have this RegEx:

('.+')

It has to match character literals like in C. For example, if I have 'a' b 'a' it should match the a's and the ''s around them.

However, it also matches the b (it should not), probably because the b is, strictly speaking, also between ''s.

Here is a screenshot of how it goes wrong (I use this for syntax highlighting):
[screenshot]

I'm fairly new to regular expressions. How can I tell the regex not to match this?


Solution 1:

The .+ is greedy: it matches from the first apostrophe to the last one, including everything in between.

This version only allows characters that aren't apostrophes between the quotes:

('[^']+')

Another option is a non-greedy quantifier:

('.+?')
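
To see the difference side by side, here is a small sketch in Python (an assumption on my part, since the question doesn't say which regex engine is used; most engines with Perl-style syntax behave the same way for these patterns). It runs all three patterns against the example string from the question.

```python
import re

text = "'a' b 'a'"

# Greedy: .+ grabs as much as it can, so the match runs
# from the first apostrophe to the last one.
greedy = re.findall(r"('.+')", text)

# Negated class: [^']+ cannot cross an apostrophe.
negated = re.findall(r"('[^']+')", text)

# Non-greedy: .+? stops at the first closing apostrophe it can.
lazy = re.findall(r"('.+?')", text)

print(greedy)   # ["'a' b 'a'"]
print(negated)  # ["'a'", "'a'"]
print(lazy)     # ["'a'", "'a'"]
```

Both fixes give the two separate 'a' literals for this input.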

Solution 2:

Have you tried a non-greedy version, e.g. ('.+?')?

There are usually two modes of matching (or two sets of quantifiers): maximal (greedy) and minimal (non-greedy). The first gives the longest possible match, the second the shortest. You can read about it (although in a Perl context) in the Perl Cookbook (Section 6.15).
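
As a rough illustration of the two sets of quantifiers, the sketch below uses Python's re module (my choice for the example, not something from the question). Appending ? to a quantifier turns it into its minimal counterpart.

```python
import re

subject = "aaa"

# Each pair is (greedy pattern, non-greedy pattern).
pairs = [
    (r"a+", r"a+?"),
    (r"a*", r"a*?"),
    (r"a{1,3}", r"a{1,3}?"),
]

results = {}
for greedy, lazy in pairs:
    # re.match anchors at the start of the string.
    results[greedy] = re.match(greedy, subject).group()
    results[lazy] = re.match(lazy, subject).group()

print(results[r"a+"], repr(results[r"a+?"]))          # aaa 'a'
print(results[r"a*"], repr(results[r"a*?"]))          # aaa ''
print(results[r"a{1,3}"], repr(results[r"a{1,3}?"]))  # aaa 'a'
```

Each greedy form takes as many characters as it can, while the non-greedy form takes as few as the rest of the pattern allows.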

Solution 3:

Try:

('[^']+')

The ^ at the start of a character class negates it, so [^'] matches any character except the ones listed in the square brackets. This way, it won't match 'a' b 'a' as a single match, because there's a ' in between; instead it'll give both instances of 'a'.
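
Since the question mentions syntax highlighting, here is a minimal sketch of how the negated-class pattern could be used to get the positions to highlight. It is written in Python as an assumption (the question doesn't name a language), and CHAR_LITERAL and literal_spans are hypothetical names made up for this example.

```python
import re

# Same pattern as above, without the capture group (hypothetical name).
CHAR_LITERAL = re.compile(r"'[^']+'")

def literal_spans(source):
    """Return (start, end) offsets of each quoted literal in source."""
    return [m.span() for m in CHAR_LITERAL.finditer(source)]

spans = literal_spans("'a' b 'a'")
print(spans)  # [(0, 3), (6, 9)]
```

The b sits at offset 4, outside both spans, so it would not be highlighted. Note that this simple pattern does not handle an escaped apostrophe such as '\''.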