What is a word boundary in regex?

I'm trying to use regexes to match space-separated numbers. I can't find a precise definition of \b ("word boundary"). I had assumed that -12 would be an "integer word" (matched by \b\-?\d+\b), but it appears that this does not work. I'd be grateful to know of ways of matching space-separated numbers.

[I am using Java regexes in Java 1.6]

Example:

Pattern pattern = Pattern.compile("\\s*\\b\\-?\\d+\\s*");
String plus = " 12 ";
System.out.println(""+pattern.matcher(plus).matches());

String minus = " -12 ";
System.out.println(""+pattern.matcher(minus).matches());

pattern = Pattern.compile("\\s*\\-?\\d+\\s*");
System.out.println(""+pattern.matcher(minus).matches());

This returns:

true
false
true

Solution 1:

A word boundary, in most regex dialects, is a position between \w and \W (non-word char), or at the beginning or end of a string if it begins or ends (respectively) with a word character ([0-9A-Za-z_]).

So, in the string "-12", it would match before the 1 or after the 2. The dash is not a word character.
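To make the positions concrete, here is a minimal Java sketch that lists every index where \b matches, then compares the pattern from the question with a variant that moves the boundary to just after the optional sign. The class name, the helper boundaryPositions, and the adjusted pattern are illustrative assumptions, not part of the original post, and the sketch only considers ASCII input.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class WordBoundaryPositions {
    // Returns every index in s at which \b matches (0 = before the first char).
    static List<Integer> boundaryPositions(String s) {
        List<Integer> positions = new ArrayList<Integer>();
        Matcher m = Pattern.compile("\\b").matcher(s);
        while (m.find()) {
            positions.add(m.start());
        }
        return positions;
    }

    public static void main(String[] args) {
        System.out.println(boundaryPositions("-12"));   // [1, 3]
        System.out.println(boundaryPositions(" -12 ")); // [2, 4]

        // Pattern from the question: \b is required between ' ' and '-',
        // where neither neighbour is a word character, so it cannot match.
        Pattern original = Pattern.compile("\\s*\\b-?\\d+\\s*");
        // Illustrative variant (an assumption, not from the post): the boundary
        // sits after the optional sign, directly before the first digit.
        Pattern adjusted = Pattern.compile("\\s*-?\\b\\d+\\b\\s*");

        System.out.println(original.matcher(" -12 ").matches()); // false
        System.out.println(adjusted.matcher(" -12 ").matches()); // true
        System.out.println(adjusted.matcher(" 12 ").matches());  // true
    }
}
```

In " -12 " the only boundaries are before the 1 and after the 2, which is why a \b placed in front of the dash has nothing to match.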

Solution 2:

While learning regular expressions, I got really stuck on the metacharacter \b. I couldn't work out what it meant and kept asking myself "what is it, what is it?". After some experiments on the website, I noticed the pink vertical dashes it drew at the very beginning and the very end of each word. At that point the meaning clicked: \b is exactly a word (\w) boundary.

This answer is aimed purely at building intuition. For the logic behind it, see the other answers.

Solution 3:

A word boundary can occur in one of three positions:

  1. Before the first character in the string, if the first character is a word character.
  2. After the last character in the string, if the last character is a word character.
  3. Between two characters in the string, where one is a word character and the other is not a word character.

Word characters are alphanumeric (plus the underscore); a minus sign is not a word character. Taken from Regex Tutorial.
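The three rules above can be sketched as a hand-written check, without any regex engine. This is only an illustration: the class and method names are made up, and isWordChar assumes the plain ASCII definition of a word character (letters, digits, underscore).

```java
class ThreeRuleBoundary {
    // ASCII word character (assumed definition): letter, digit, or underscore.
    static boolean isWordChar(char c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
                || (c >= '0' && c <= '9') || c == '_';
    }

    // True if index i (0..s.length()) is a word boundary under the three rules.
    static boolean isBoundary(String s, int i) {
        if (s.isEmpty()) {
            return false; // no characters, so no boundary
        }
        if (i == 0) {
            return isWordChar(s.charAt(0));                  // rule 1
        }
        if (i == s.length()) {
            return isWordChar(s.charAt(s.length() - 1));     // rule 2
        }
        // rule 3: exactly one of the two neighbours is a word character
        return isWordChar(s.charAt(i - 1)) != isWordChar(s.charAt(i));
    }

    public static void main(String[] args) {
        String s = " -12 ";
        for (int i = 0; i <= s.length(); i++) {
            System.out.println(i + ": " + isBoundary(s, i));
        }
    }
}
```

For " -12 " only indices 2 and 4 come out as boundaries. The position between the space and the minus sign (index 1) does not, because neither neighbour is a word character.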

Solution 4:

I would like to explain Alan Moore's answer:

A word boundary is a position that is either preceded by a word character and not followed by one or followed by a word character and not preceded by one.

Suppose I have the string "This is a cat, and she's awesome", and I want to replace the letter 'a' only where it starts a word, that is, where a word boundary comes immediately before it.

In other words: the letter a inside 'cat' should not be replaced.

So I'll run this substitution (in Python):

import re
re.sub(r"\ba", "e", myString.strip())  # replace each word-initial 'a' with 'e'

so the output will be

This is a cat, and she's awesome ->

This is e cat, end she's ewesome  (result)
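Since the question is about Java, here is a rough Java counterpart of the same substitution. The class and method names are hypothetical, and trim() is used as an approximate stand-in for Python's strip().

```java
class ReplaceAtWordStart {
    // Replaces every 'a' that has a word boundary directly before it with 'e'.
    static String replaceLeadingA(String text) {
        return text.trim().replaceAll("\\ba", "e");
    }

    public static void main(String[] args) {
        System.out.println(replaceLeadingA("This is a cat, and she's awesome"));
        // prints: This is e cat, end she's ewesome
    }
}
```

The 'a' in "cat" is left alone because it is preceded by the word character 'c', so there is no boundary in front of it.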

Solution 5:

A word boundary is a position that is either preceded by a word character and not followed by one, or followed by a word character and not preceded by one.
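That definition can be spelled out with lookarounds, which makes it easy to check against the built-in \b. The sketch below is an illustration under assumptions: the class, field, and method names are made up, and it only compares ASCII strings, since \b and \w may treat some non-ASCII characters differently depending on the Java version.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class BoundaryAsLookarounds {
    // "Preceded by a word char and not followed by one,
    //  or followed by a word char and not preceded by one."
    static final Pattern SPELLED_OUT =
            Pattern.compile("(?<=\\w)(?!\\w)|(?<!\\w)(?=\\w)");
    static final Pattern BUILT_IN = Pattern.compile("\\b");

    // Collects the start index of every (zero-width) match of p in s.
    static List<Integer> positions(Pattern p, String s) {
        List<Integer> result = new ArrayList<Integer>();
        Matcher m = p.matcher(s);
        while (m.find()) {
            result.add(m.start());
        }
        return result;
    }

    public static void main(String[] args) {
        String[] samples = { "-12", " -12 ", "a_b c" };
        for (String s : samples) {
            System.out.println("\"" + s + "\" \\b: " + positions(BUILT_IN, s)
                    + " lookarounds: " + positions(SPELLED_OUT, s));
        }
    }
}
```

For each of these samples the two patterns report the same positions, for example [2, 4] for " -12 ".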