Extract string into columns based on regex
This is an example string:
123456#p654321
Currently, I am using this pattern to capture 123456 and 654321 into two different groups:
([0-9].*)#p([0-9].*)
But occasionally the #p654321 part of the string will be missing, and then I only want to capture the first group. I tried to make the second group "optional" by appending ? to it, which works, but only as long as there is a #p at the end of the remaining string.
What would be the best way to solve this problem?
You have the #p outside of the optional capturing group, which makes it a required part of the match. You are also using the dot character (.) improperly. Dot (in most regex variants) matches any character, so [0-9].* means "one digit followed by anything at all", not "a run of digits". Change it to:
([0-9]*)(?:#p([0-9]*))?
The (?:...) syntax is how you get a non-capturing group. Inside it, we capture just the digits that you're interested in. Finally, the trailing ? makes the whole non-capturing group, #p included, optional.
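The question does not say which language or tool is in use, so here is a minimal sketch that assumes Python's re module (the groups helper is just an illustrative name). It compares the pattern from the question, with ? appended to the second group, against the pattern with an optional non-capturing group:

```python
import re

# Pattern from the question with "?" appended to the second group.
# The literal "#p" sits outside the optional part, so it is still required.
original = re.compile(r"([0-9].*)#p([0-9].*)?")

# Pattern with "#p" and the digits after it made optional together.
fixed = re.compile(r"([0-9]*)(?:#p([0-9]*))?")


def groups(pattern, text):
    """Return the captured groups as a tuple, or None if there is no match."""
    m = pattern.match(text)
    return m.groups() if m else None


with_suffix = groups(fixed, "123456#p654321")
without_suffix = groups(fixed, "123456")
original_without_suffix = groups(original, "123456")

print(with_suffix)              # ('123456', '654321')
print(without_suffix)           # ('123456', None)
print(original_without_suffix)  # None
```

In Python an optional group that did not take part in the match is reported as None; other regex engines may report an empty string instead.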
Also, most regex variants have a \d character class for digits. So you could simplify even further:
(\d*)(?:#p(\d*))?
As another person has pointed out, the * operator could potentially match zero digits. To prevent this, use the + operator instead:
(\d+)(?:#p(\d+))?
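To see the difference between * and + in practice, here is a small sketch, again assuming Python's re module. The inputs "abc" and "123456#p" are made-up test strings, not from the question:

```python
import re

star = re.compile(r"(\d*)(?:#p(\d*))?")
plus = re.compile(r"(\d+)(?:#p(\d+))?")

# With "*", the pattern matches even when there are no digits at all.
star_on_letters = star.match("abc").groups()
# With "+", at least one digit is required, so there is no match.
plus_on_letters = plus.match("abc")

# A dangling "#p" with nothing after it: "*" captures an empty string,
# while "+" skips the optional group entirely.
star_on_dangling = star.match("123456#p").groups()
plus_on_dangling = plus.match("123456#p").groups()

print(star_on_letters)   # ('', None)
print(plus_on_letters)   # None
print(star_on_dangling)  # ('123456', '')
print(plus_on_dangling)  # ('123456', None)
```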
Your regex can actually match zero digits, because you've used * instead of +.
This is what (I think) you want:
(\d+)(?:#p(\d+))?
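As a rough illustration of getting the two groups into columns, here is a sketch that assumes Python's re module and a plain list of strings. The split_columns helper and the sample inputs are hypothetical, and using fullmatch to reject strings with trailing junk is my own choice rather than something the question asks for:

```python
import re

PATTERN = re.compile(r"(\d+)(?:#p(\d+))?")


def split_columns(values):
    """Return one (first, second) tuple per input string.

    second is None when the "#p..." part is absent; strings that do not
    match the pattern in full produce (None, None).
    """
    rows = []
    for value in values:
        m = PATTERN.fullmatch(value)
        rows.append(m.groups() if m else (None, None))
    return rows


samples = ["123456#p654321", "123456", "abc"]  # hypothetical inputs
rows = split_columns(samples)
for row in rows:
    print(row)
# ('123456', '654321')
# ('123456', None)
# (None, None)
```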