How can I store regex captures in an array in Perl?

Solution 1:

If you're doing a global match (/g), the regex in list context returns all of the captured matches. Simply do:

my @matches = ( $str =~ /pa(tt)ern/g );

For example, this command:

perl -le '@m = ( "foo12gfd2bgbg654" =~ /(\d+)/g ); print for @m'

Gives the output:

12
2
654

Solution 2:

See perldoc perlop, under "Matching in List Context":

If the /g option is not used, m// in list context returns a list consisting of the subexpressions matched by the parentheses in the pattern, i.e., ($1 , $2 , $3 ...)

The /g modifier specifies global pattern matching--that is, matching as many times as possible within the string. How it behaves depends on the context. In list context, it returns a list of the substrings matched by any capturing parentheses in the regular expression. If there are no parentheses, it returns a list of all the matched strings, as if there were parentheses around the whole pattern.

You can simply grab all the matches by assigning to an array, or otherwise performing the evaluation in list context:

my @matches = ($string =~ m/word/g);
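To make the quoted rules concrete, here is a minimal sketch of the three cases. The sample string and variable names are made up for illustration. Note that with /g and more than one capture group, the captures from every match come back as one flat list.

```perl
use strict;
use warnings;

my $string = 'a=1, b=2, c=3';

# No /g, with parentheses: captures from the first match only.
my @first = $string =~ /(\w)=(\d)/;     # ('a', '1')

# /g with parentheses: captures from every match, flattened into one list.
my @all = $string =~ /(\w)=(\d)/g;      # ('a', '1', 'b', '2', 'c', '3')

# /g without parentheses: each whole match, as if the pattern were in ( ).
my @whole = $string =~ /\w=\d/g;        # ('a=1', 'b=2', 'c=3')

print "@first\n@all\n@whole\n";
```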

Solution 3:

Sometimes you need to get all matches globally, grouped per match, like PHP's preg_match_all does. If that's your case, you can write something like:

# a dummy example
my $subject = 'Philip Fry Bender Rodriguez Turanga Leela';
my @matches;
push @matches, [$1, $2] while $subject =~ /(\w+) (\w+)/g;

use Data::Dumper;
print Dumper(\@matches);

It prints

$VAR1 = [
          [
            'Philip',
            'Fry'
          ],
          [
            'Bender',
            'Rodriguez'
          ],
          [
            'Turanga',
            'Leela'
          ]
        ];

Solution 4:

I think this example is self-explanatory. Note the /g modifier in the first regex:

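use Data::Dumper;

my $string = "one two three four";

# With /g: one capture per match, for every match in the string.
my @res = $string =~ m/(\w+)/g;
print Dumper(@res); # @res = ("one", "two", "three", "four")

# Without /g: the captures of the first match only.
@res = $string =~ m/(\w+) (\w+)/;
print Dumper(@res); # @res = ("one", "two")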

Remember, the left-hand side of the assignment must be in list context, which means you have to surround scalar variables with parentheses:

my ($one, $two) = $string =~ m/(\w+) (\w+)/;
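A short sketch of why the parentheses matter, reusing the string from the example above. The variable names $matched and $count are only illustrative. Without parentheses the match runs in scalar context and returns a success flag rather than the captures.

```perl
use strict;
use warnings;

my $string = "one two three four";

# Scalar assignment: the match runs in scalar context and yields true/false.
my $matched = $string =~ m/(\w+) (\w+)/;      # 1 (true), not "one"

# Parentheses on the left force list context, so the captures are returned.
my ($one, $two) = $string =~ m/(\w+) (\w+)/;  # ("one", "two")

# Assigning to an empty list first, then to a scalar, counts the /g matches.
my $count = () = $string =~ m/\w+/g;          # 4

print "$matched $one $two $count\n";
```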

Solution 5:

Is it possible to store all matches for a regular expression into an array?

Yes. Perl 5.25.7 added the variable @{^CAPTURE}, which holds "the contents of the capture buffers, if any, of the last successful pattern match". This means it contains ($1, $2, ...) even if the number of capture groups is unknown. Since 5.25.7 is a development release, the first stable Perl with this variable is 5.26.0.
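A minimal sketch of how this might be used, assuming a Perl that has the variable (hence the version guard). The date string and the variable names are made up for illustration.

```perl
use 5.026;   # guard: @{^CAPTURE} is not available in older stable releases
use strict;
use warnings;

my $subject = '2024-01-31';
my @capture;

if ($subject =~ /(\d+)-(\d+)-(\d+)/) {
    # Copy right away: the variable always reflects the last successful match.
    @capture = @{^CAPTURE};
}

print join(',', @capture), "\n";
```

Copying into a lexical array immediately after the match keeps the values safe from any later successful match in the same scope.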

On Perls older than 5.25.7 (back to 5.6.0), you can build the same array using @- and @+, as suggested by @Jaques in his answer. You would have to do something like this:

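    # Run this right after a successful match against $subject.
    # $-[$i] and $+[$i] are the start and end offsets of capture group $i.
    my @capture;
    for my $i (1 .. $#+) {
        push @capture, substr $subject, $-[$i], $+[$i] - $-[$i];
    }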
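One caveat with the offset approach: for a capture group that did not take part in the match (an optional group, say), $-[$i] and $+[$i] are undef, so substr would warn. A hedged, self-contained variant that stores undef for such groups might look like this. The subject string and pattern are invented for the example.

```perl
use strict;
use warnings;

my $subject = 'key=value';
my @capture;

# The third group is optional and does not participate in this match.
if ($subject =~ /(\w+)=(\w+)(;\w+)?/) {
    for my $i (1 .. $#+) {
        # Offsets are undef for a group that did not take part in the match.
        push @capture, defined($-[$i])
            ? substr($subject, $-[$i], $+[$i] - $-[$i])
            : undef;
    }
}

print scalar(@capture), " capture slots\n";
```

This mirrors what ($1, $2, $3) would hold here: two strings followed by undef.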