How to get the captured groups from Select-String?

I'm trying to extract text from a set of files on Windows using PowerShell (version 4):

PS > Select-String -AllMatches -Pattern <mypattern-with(capture)> -Path file.jsp | Format-Table

So far, so good. That gives a nice set of MatchInfo objects:

IgnoreCase                    LineNumber Line                          Filename                      Pattern                       Matches
----------                    ---------- ----                          --------                      -------                       -------
    True                            30   ...                           file.jsp                      ...                           {...}

Next, I see that the captures are in the Matches member, so I take them out:

PS > Select-String -AllMatches -Pattern <mypattern-with(capture)> -Path file.jsp | ForEach-Object -MemberName Matches | Format-Table

Which gives:

Groups        Success Captures                 Index     Length Value
------        ------- --------                 -----     ------ -----
{...}         True    {...}                    49        47     ...

or as list with | Format-List:

Groups   : {matched text, captured group}
Success  : True
Captures : {matched text}
Index    : 39
Length   : 33
Value    : matched text

Here's where I get stuck: I have no idea how to go further and obtain a list of the captured group elements.

I've tried adding another | ForEach-Object -MemberName Groups, but it seems to return the same as the above.

The closest I get is with | Select-Object -Property Groups, which indeed gives me what I'd expect (a list of sets):

Groups
------
{matched text, captured group}
{matched text, captured group}
...

But then I'm unable to extract the captured group from each of them. When I try | Select-Object -Index 1, I get only one of those sets.


Update: a possible solution

It seems that by adding | ForEach-Object { $_.Groups.Groups[1].Value } I got what I was looking for, but I don't understand why - so I can't be sure I would be able to get the right result when extending this method to whole sets of files.

Why is it working?

As a side note, this | ForEach-Object { $_.Groups[1].Value } (i.e. without the second .Groups) gives the same result.

I'd like to add that, upon further attempts, it seems the command can be shortened by removing the piped | Select-Object -Property Groups.


Have a look at the following

$a = "http://192.168.3.114:8080/compierews/" | Select-String -Pattern '^http://(.*):8080/(.*)/$' 

$a is now a MatchInfo object (check with $a.GetType()), and it contains a Matches property.

PS ps:\> $a.Matches
Groups   : {http://192.168.3.114:8080/compierews/, 192.168.3.114, compierews}
Success  : True
Captures : {http://192.168.3.114:8080/compierews/}
Index    : 0
Length   : 37
Value    : http://192.168.3.114:8080/compierews/

The Groups member holds what you are looking for, so you can write:

"http://192.168.3.114:8080/compierews/" | Select-String -Pattern '^http://(.*):8080/(.*)/$'  | % {"IP is $($_.matches.groups[1]) and path is $($_.matches.groups[2])"}

IP is 192.168.3.114 and path is compierews
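
When a line can contain more than one hit (the -AllMatches case from the question), it may help to walk the objects explicitly: each MatchInfo has a Matches collection, and each Match has its own Groups. The following is only a sketch; the sample line and the variable names $line and $hosts are made up for illustration.

```powershell
# Hypothetical sample input: two URLs on one line, so -AllMatches
# produces two Match objects inside a single MatchInfo.
$line = 'http://10.0.0.1:8080/alpha/ http://10.0.0.2:8080/beta/'

$hosts = $line |
    Select-String -AllMatches -Pattern 'http://([^:/]+):8080/([^/]+)/' |
    ForEach-Object {
        # $_ is a MatchInfo; $_.Matches holds one Match per hit on this line.
        foreach ($m in $_.Matches) {
            # Groups[0] is the whole match; Groups[1] is the first capture.
            $m.Groups[1].Value
        }
    }

$hosts
```

Looping over $_.Matches keeps the first capture of every hit separate, which is usually what you want when the same pattern runs over many files.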

According to the PowerShell docs on Regular Expressions > Groups, Captures, and Substitutions:

When you use the -match operator, PowerShell creates an automatic variable named $Matches.

PS> "The last logged on user was CONTOSO\jsmith" -match "(.+was )(.+)"

The expression itself only returns True or False, but PowerShell also populates the $Matches hashtable.

So if you output $Matches, you'll get all capture groups:

PS> $Matches

Name     Value
----     -----
2        CONTOSO\jsmith
1        The last logged on user was
0        The last logged on user was CONTOSO\jsmith

And you can access each capture group individually with dot notation like this:

PS> "The last logged on user was CONTOSO\jsmith" -match "(.+was )(.+)"
PS> $Matches.2
CONTOSO\jsmith
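
Numbered keys get hard to read once a pattern has several groups. As a small sketch, named groups written as (?<name>...) appear in $Matches under their names; the group names domain and user below are chosen purely for illustration. Keep in mind that -match only records the first match in the string.

```powershell
$text = 'The last logged on user was CONTOSO\jsmith'

# -match returns $true/$false; on success it fills the automatic $Matches hashtable.
if ($text -match 'was (?<domain>\w+)\\(?<user>\w+)') {
    $domain = $Matches['domain']   # index syntax
    $user   = $Matches.user        # dot notation works as well
}

"Domain: $domain, user: $user"
```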

Additional Resources:

  • To Get Multiple Matches, see How to capture multiple regex matches
  • To Pass Options/Flags, see Pass regex options to PowerShell [regex] type

Late answer, but to loop multiple matches and groups I use:

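$pattern = "Login:\s*([^\s]+)\s*Password:\s*([^\s]+)\s*"
# Named $results rather than $matches: $Matches is an automatic variable
# that the -match operator overwrites.
$results = [regex]::Matches($input_string, $pattern)

foreach ($match in $results)
{
    Write-Host  $match.Groups[1].Value   # first capture: the login
    Write-Host  $match.Groups[2].Value   # second capture: the password
}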

This worked for my situation.
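
The answer above does not show what $input_string contains, so here is a self-contained sketch with made-up input. It also collects the groups into objects instead of writing them to the host, which makes the result easier to filter or export; the property names Login and Password are my own choice.

```powershell
# Hypothetical input text; the real data is not shown in the answer.
$input_string = @'
Login: alice Password: s3cret
Login: bob Password: hunter2
'@

$pattern = 'Login:\s*([^\s]+)\s*Password:\s*([^\s]+)\s*'

# [regex]::Matches returns every match in the string, not just the first.
$credentials = foreach ($m in [regex]::Matches($input_string, $pattern)) {
    [pscustomobject]@{
        Login    = $m.Groups[1].Value
        Password = $m.Groups[2].Value
    }
}

$credentials | Format-Table
```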

Using the file: test.txt

// autogenerated by script
char VERSION[21] = "ABCDEFGHIJKLMNOPQRST";
char NUMBER[16] = "123456789012345";

Get the NUMBER and VERSION from the file.

PS C:\> Select-String -Path test.txt -Pattern 'VERSION\[\d+\]\s=\s\"(.*)\"' | %{$_.Matches.Groups[1].value}

ABCDEFGHIJKLMNOPQRST

PS C:\> Select-String -Path test.txt -Pattern 'NUMBER\[\d+\]\s=\s\"(.*)\"' | %{$_.Matches.Groups[1].value}

123456789012345
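
If you need both values, a single pass with two capture groups is one option: the first group captures the variable name and the second its value. This is only a sketch. It recreates the sample file in the temp directory (that location and the $values hashtable are assumptions for the example) so it can run on its own.

```powershell
# Recreate the sample file in a temporary location (hypothetical path).
$path = Join-Path ([System.IO.Path]::GetTempPath()) 'test.txt'
@'
// autogenerated by script
char VERSION[21] = "ABCDEFGHIJKLMNOPQRST";
char NUMBER[16] = "123456789012345";
'@ | Set-Content -Path $path

$values = @{}
Select-String -Path $path -Pattern '(VERSION|NUMBER)\[\d+\]\s=\s"(.*)"' |
    ForEach-Object {
        # Without -AllMatches there is exactly one Match per matching line.
        $m = $_.Matches[0]
        # Groups[1] is the variable name, Groups[2] the quoted value.
        $values[$m.Groups[1].Value] = $m.Groups[2].Value
    }

$values['VERSION']
$values['NUMBER']
```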