Split the sentences by ',' and remove surrounding spaces
I have this code:
var r = /(?:^\s*([^\s]*)\s*)(?:,\s*([^\s]*)\s*){0,}$/
var s = " a , b , c "
var m = s.match(r)
m => [" a , b , c ", "a", "c"]
It looks like the whole string has been matched, but where has "b" gone? I would rather expect to get:
[" a , b , c ", "a", "b", "c"]
so that I can do m.shift() and end up with a result like s.split(','), but with the whitespace removed as well.
Do I have a mistake in the regexp, or do I misunderstand String.prototype.match?
Solution 1:
Here's a pretty simple & straightforward way to do this without needing a complex regular expression.
var str = " a , b , c "
var arr = str.split(",").map(function(item) {
return item.trim();
});
//arr = ["a", "b", "c"]
The native .map is supported on IE9 and up: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Array/map
Or in ES6+ it gets even shorter:
var arr = str.split(",").map(item => item.trim());
And for completeness, here it is in TypeScript with type annotations:
var arr: string[] = str.split(",").map((item: string) => item.trim());
Solution 2:
ES6 shorthand:
str.split(',').map(item=>item.trim())
Solution 3:
You can try this without complex regular expressions.
var arr = " a , b , c ".trim().split(/\s*,\s*/);
console.log(arr);
Solution 4:
Short answer: Use m = s.match(/[^ ,]+/g);
Your RE doesn't work as expected because a repeated capture group only keeps its most recent match (= c). If you omit {0,}$, the returned match will be " a , b ", "a", "b". In short, match returns exactly one entry per capture group (after the full match) unless you use the global flag /g. In that case, the returned list holds all matched substrings, and the capture groups are not included.
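To see the three cases side by side, here is a small sketch. The variable names (repeated, once, m1, m2, m3) are mine and only for illustration; the first pattern is the one from the question.

```javascript
var s = " a , b , c ";

// Pattern from the question: the second capture group is inside a repeated group.
var repeated = /(?:^\s*([^\s]*)\s*)(?:,\s*([^\s]*)\s*){0,}$/;
var m1 = s.match(repeated);
// m1[1] is "a"; m1[2] is "c", because every repetition overwrites the group.

// Same pattern without the trailing {0,}$: the second group runs exactly once.
var once = /(?:^\s*([^\s]*)\s*)(?:,\s*([^\s]*)\s*)/;
var m2 = s.match(once);
// m2[0] is " a , b ", m2[1] is "a", m2[2] is "b".

// With the global flag, match() returns every full match and no groups.
var m3 = s.match(/[^\s,]+/g);
// m3 is ["a", "b", "c"]

console.log(m1, m2, m3);
```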
To achieve your effect, use:
m = s.replace(/\s*(,|^|$)\s*/g, "$1");
This replaces every comma (,), beginning (^) and end ($), together with any surrounding whitespace, by the original character (a comma, or nothing).
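As a quick check of what that replace produces, here is a sketch with two inputs. The second input and the variable names are my own additions, chosen to show that whitespace inside an item is left alone. Note that the result is still a string, not an array.

```javascript
var pattern = /\s*(,|^|$)\s*/g;

// Input from the question.
var cleaned = " a , b , c ".replace(pattern, "$1");
// cleaned is "a,b,c"

// Hypothetical input with a space inside an item: only the whitespace
// next to a comma or at either end of the string is removed.
var cleaned2 = " foo bar ,baz ".replace(pattern, "$1");
// cleaned2 is "foo bar,baz"

console.log(cleaned, cleaned2);
```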
If you want to get an array, use:
m = s.replace(/^\s+|\s+$/g,"").split(/\s*,\s*/);
This RE trims the string (removes all whitespace at the beginning and end), then splits the string by <any whitespace>,<any whitespace>. Note that whitespace characters also include newlines and tabs. If you want to stick to spaces only, use a literal space ( ) instead of \s.
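The difference between \s and a literal space only shows up when the input contains tabs or newlines. A minimal sketch, using a made-up input string of my own:

```javascript
// Hypothetical input: a tab before the first comma and a newline after it.
var s = "a\t,\nb , c";

// \s matches tabs and newlines too, so they are stripped with the commas.
var anyWhitespace = s.split(/\s*,\s*/);
// anyWhitespace is ["a", "b", "c"]

// A literal space leaves the tab and the newline attached to the items.
var spacesOnly = s.split(/ *, */);
// spacesOnly is ["a\t", "\nb", "c"]

console.log(anyWhitespace, spacesOnly);
```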
Solution 5:
You can do this for your purpose
EDIT: Removing second replace as suggested in the comments.
s.replace(/^\s*|\s*$/g,'').split(/\s*,\s*/)
The replace first trims the string, and then the split function splits around '\s*,\s*'. This gives the output ["a", "b", "c"] on the input " a , b , c ".
As for why your regex is not capturing 'b': you are repeating a capturing group, so only the last occurrence is kept. More on that here: http://www.regular-expressions.info/captureall.html
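If you do want to collect every item with a capture group rather than with split, one option is to leave the group unrepeated and apply the regex repeatedly with the global flag. This is only a sketch: the pattern and the names itemPattern and items are my own, and it assumes the items contain no whitespace or commas, as in the question.

```javascript
var s = " a , b , c ";

// One capture group, not repeated inside the pattern. The /g flag makes
// exec() resume from where the previous match ended.
var itemPattern = /([^\s,]+)/g;

var items = [];
var match;
while ((match = itemPattern.exec(s)) !== null) {
  // match[1] holds the capture group for this particular match.
  items.push(match[1]);
}
// items is ["a", "b", "c"]

console.log(items);
```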