Regex for partial path
I have paths like these (single lines):
/
/abc
/def/
/ghi/jkl
/mno/pqr/
/stu/vwx/yz
/abc/def/ghi/jkl
I just need patterns that match up to the third "/". In other words, paths containing just "/" and up to the first 2 directories. However, some of my directories end with a "/" and some don't. So the result I want is:
/
/abc
/def/
/ghi/jkl
/mno/pqr/
/stu/vwx/
/abc/def/
So far, I've tried (\/|.*\/), but this doesn't get the paths that end without a "/".
I would recommend this pattern:
/^(\/[^\/]+){0,2}\/?$/gm
It works like this:
- ^ searches for the beginning of a line
- (\/[^\/]+) searches for a path element
  - ( starts a group
  - \/ searches for a slash
  - [^\/]+ searches for some non-slash characters
- {0,2} says that 0 to 2 of those path elements should be found
- \/? allows a trailing slash
- $ searches for the end of the line
Use these modifiers:
- g to search for several matches within the input
- m to treat every line as a separate input
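As a rough sketch of how this pattern could be used from Java (the class and method names are made up for illustration): the /.../gm wrapper is dropped and each line is tested on its own, because [^/] also matches a newline, so a single pass over the whole multi-line text could let one match span two lines. The second pattern, without the trailing $, is an assumed variation that is not part of this answer; it cuts a deeper path after its second directory instead of rejecting it, which seems to be what the question's expected output asks for.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ShallowPaths {
    // The answer's pattern without the /.../gm wrapper; "/" needs no escaping in Java.
    static final Pattern WHOLE_LINE = Pattern.compile("^(/[^/]+){0,2}/?$");

    // Assumed variation (not from the answer): no "$", so a deeper path is
    // truncated after its second directory instead of being rejected.
    static final Pattern PREFIX = Pattern.compile("^(/[^/]+){0,2}/?");

    // Keeps only the lines that consist of at most two directories.
    static List<String> shallowLines(String text) {
        List<String> kept = new ArrayList<>();
        for (String line : text.split("\\R")) {   // \R matches any line break
            if (WHOLE_LINE.matcher(line).matches()) {
                kept.add(line);
            }
        }
        return kept;
    }

    // Returns the leading part of a line, up to and including the third "/".
    static String prefixOf(String line) {
        Matcher m = PREFIX.matcher(line);
        return m.lookingAt() ? m.group() : "";
    }

    public static void main(String[] args) {
        String text = "/\n/abc\n/def/\n/ghi/jkl\n/mno/pqr/\n/stu/vwx/yz\n/abc/def/ghi/jkl";
        System.out.println(shallowLines(text));
        System.out.println(prefixOf("/stu/vwx/yz"));
    }
}
```

With the sample lines from the question, shallowLines keeps the first five paths and drops the two deeper ones, while prefixOf shortens a deeper path such as /stu/vwx/yz to /stu/vwx/.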
You need a pattern like ^(\/\w+){0,2}\/?$. It checks that a slash followed by a name occurs no more than 2 times, and that the string can end with /.
Details:
- ^ : beginning of the string
- (\/\w+) : a slash (escaped) followed by one or more word characters, all in a group
- {0,2} : the group can occur 0, 1, or 2 times
- \/? : a slash (escaped) can occur 0 or 1 time
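One thing worth checking with \w+ is which directory names it accepts: \w covers letters, digits, and the underscore, so names containing a hyphen or a dot will not match. The sketch below compares it with a [^/]+ variant in Java; the class name, method names, and the sample path /my-dir/file.txt are made up for illustration.

```java
import java.util.regex.Pattern;

final class WordCharVsNonSlash {
    // Pattern from this answer: names are limited to word characters.
    static final Pattern WORD_CHARS = Pattern.compile("^(/\\w+){0,2}/?$");

    // Variant for comparison: a name is any run of non-slash characters.
    static final Pattern NON_SLASH = Pattern.compile("^(/[^/]+){0,2}/?$");

    static boolean wordCharsMatch(String path) {
        return WORD_CHARS.matcher(path).matches();
    }

    static boolean nonSlashMatch(String path) {
        return NON_SLASH.matcher(path).matches();
    }

    public static void main(String[] args) {
        String path = "/my-dir/file.txt";   // hypothetical sample path
        System.out.println(wordCharsMatch(path));  // "-" and "." are not word characters
        System.out.println(nonSlashMatch(path));
    }
}
```

For the paths in the question both patterns behave the same, since every name there consists of letters only.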
Your regex (\/|.*\/) uses an alternation that matches either a single forward slash, or any characters 0+ times (greedy) followed by a forward slash.
For example, in /ghi/jkl the first match will be the first forward slash. For the next match, the .* part of the second alternative first matches from the first g until the end of the string. The engine then backtracks to the last forward slash to fulfill the whole .*\/ pattern.
The trailing jkl can no longer be matched by either alternative.
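The behaviour described above can be reproduced with a small Java sketch that collects every match the original regex finds; the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class AlternationTrace {
    // Collects every successive match of regex in input, like a global search.
    static List<String> allMatches(String regex, String input) {
        List<String> found = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        // The question's regex, written without the unnecessary escapes.
        System.out.println(allMatches("(/|.*/)", "/ghi/jkl"));
    }
}
```

For /ghi/jkl this yields two matches, "/" and "ghi/", and the trailing jkl is left over.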
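Note that you don't have to escape the forward slash in Java. It only needs escaping where it also serves as the regex delimiter, as in a JavaScript regex literal.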
You could use:
^/(?:\w+/?){0,2}$
In Java:
String regex = "^/(?:\\w+/?){0,2}$";
Explanation
- ^ Start of the string
- / Match forward slash
- (?: Non-capturing group
  - \w+ Match 1+ word characters (if you want to match more than \w, you could use a character class and add to it what you want to match)
  - /? Match optional forward slash
- ){0,2} Close non-capturing group and repeat 0 - 2 times
- $ End of the string
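To illustrate the remark about using a character class instead of \w, here is a minimal Java sketch. The class [\w.-] (word characters plus dot and hyphen) is only an assumed example of such an extension, and the class and method names are made up.

```java
import java.util.regex.Pattern;

final class ExtendedNames {
    // Same shape as the pattern above, with \w widened to also allow "." and "-".
    // The extra characters are an assumption; adjust the class to your own naming rules.
    static final Pattern SHALLOW = Pattern.compile("^/(?:[\\w.-]+/?){0,2}$");

    static boolean isShallow(String path) {
        return SHALLOW.matcher(path).matches();
    }

    public static void main(String[] args) {
        System.out.println(isShallow("/my-dir/v1.2/"));   // two names, trailing slash
        System.out.println(isShallow("/a/b/c"));          // three names
    }
}
```

Paths with at most two names are accepted whether or not they end with a slash, and a third name makes the match fail.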
^((/[^/]+){0,2}\/?)$
To break it down:
- ^ is the start of the string.
- {0,2} means repeat the preceding group between 0 and 2 times.
- Then it ends with an optional slash by using a ?.
- String end is $, so it doesn't match longer strings.
- () goes around the whole thing to capture it.
But I'll point out that a regex is almost always the wrong answer for directory matching. Some directories have special meaning, like /../.., which actually goes up two directories, not down. It is better to use the system's directory API instead for more robust results.
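As one possible illustration of the API route, here is a sketch using java.nio.file. Java is an assumption here, since this answer does not name a language, and the class and method names are hypothetical. The path is normalized first, so .. segments are resolved before the first two levels are taken.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

final class FirstTwoLevels {
    // Returns the root plus at most the first two directory names of raw,
    // after resolving "." and ".." segments.
    static Path firstTwoLevels(String raw) {
        Path p = Paths.get(raw).normalize();
        int keep = Math.min(2, p.getNameCount());
        if (keep == 0) {
            return p;                       // nothing but the root, e.g. "/"
        }
        Path head = p.subpath(0, keep);     // relative path of the first names
        Path root = p.getRoot();            // null for relative paths
        return root == null ? head : root.resolve(head);
    }

    public static void main(String[] args) {
        System.out.println(firstTwoLevels("/abc/def/ghi/jkl"));
        System.out.println(firstTwoLevels("/abc/def/../../ghi/jkl/mno"));
    }
}
```

Note that a Path does not keep a trailing slash, so this approach cannot tell /def/ from /def; if that distinction matters, it would have to be tracked separately.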