Don't need the whole line, just the match from regular expression

Solution 1:

Two things:

  • As stated by @Rory, you need the -o option, so that only the match is printed (instead of the whole line)
  • In addition, you need the -P option to use Perl-compatible regular expressions, which include useful constructs such as look-ahead (?= ) and look-behind (?<= ). These check that the surrounding text is present but are not part of the match, so that text is not printed.

If you want only the part inside the parentheses to be matched, do the following:

grep -oP '(?<=\/\()\w(?=\).+\/)' myfile.txt

If the file contains the string /(a)5667/, grep will print 'a', because:

  • /( are found by \/\(, but because they are in a look-behind (?<= ) they are not reported
  • a is matched by \w and is thus printed (because of -o )
  • )5667/ are found by \).+\/, but because they are in a look-ahead (?= ) they are not reported
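
As a quick way to try this without creating a file, the following sketch pipes the sample string into grep. It assumes GNU grep built with PCRE support (-P is not available in every grep). The second command uses \K, a PCRE feature that discards everything matched so far from the reported match; it is shown here only as an alternative to the look-behind. The variable names are illustrative, and the slashes are left unescaped because PCRE does not require escaping them.

```shell
#!/bin/bash
# Assumption: GNU grep with -P (PCRE) support is installed.
sample='/(a)5667/'

# Look-behind / look-ahead version, as in the answer above.
lookaround=$(printf '%s\n' "$sample" | grep -oP '(?<=/\()\w(?=\).+/)')

# Alternative: \K drops the text matched before it from the output.
reset=$(printf '%s\n' "$sample" | grep -oP '/\(\K\w(?=\).+/)')

echo "lookaround: $lookaround"
echo "reset:      $reset"
```

Both commands should print only the single character between the parentheses.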

Solution 2:

Use the -o option in grep.

For example:

$ echo "foobarbaz" | grep -o 'b[aeiou]r'
bar
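
When a line contains several matches, -o prints each one on its own line, and lines without a match produce no output. A small sketch with made-up input (note that -o is supported by GNU and BSD grep but is not required by POSIX):

```shell
#!/bin/bash
# Two input lines: the first contains two matches, the second none.
matches=$(printf '%s\n' 'foobarbaz bor' 'nothing' | grep -o 'b[aeiou]r')

# Prints "bar" and "bor", one per line.
printf '%s\n' "$matches"
```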

Solution 3:

    sed -n 's/^.*\(captureThis\).*$/\1/p'

-n      don't print lines automatically
s       substitute
^.*     matches anything before the captureThis 
\( \)   capture everything between them and assign it to \1 
.*$     matches anything after the captureThis 
\1      replace the whole line with the captured text 
p       print it
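
As a concrete illustration, here is the same idea applied to a sample line (the <Lane> input is borrowed from another answer on this page, and the variable names are made up). Because the leading .* is greedy, it helps to anchor the group with literal text such as <Lane> and to require at least one character inside it; otherwise the group may capture less than expected.

```shell
#!/bin/bash
line='test <Lane>8</Lane>'

# [0-9][0-9]* means "one or more digits" in basic regular expressions.
lane=$(printf '%s\n' "$line" | sed -n 's/^.*<Lane>\([0-9][0-9]*\)<\/Lane>.*$/\1/p')

echo "$lane"   # prints 8
```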

Solution 4:

Because you tagged your question as bash in addition to shell, there is another solution besides grep:

Since version 3.0, Bash has had built-in regular expression matching via the =~ operator, similar to Perl's. Note that it uses POSIX extended regular expressions, not Perl syntax.

Now, given the following code:

#!/bin/bash
DATA="test <Lane>8</Lane>"

if [[ "$DATA" =~ \<Lane\>([[:digit:]]+)\<\/Lane\> ]]; then
        echo "$BASH_REMATCH"
        echo "${BASH_REMATCH[1]}"
fi

  • Note that you have to invoke it as bash and not just sh in order to get all extensions
  • $BASH_REMATCH will give the whole string as matched by the whole regular expression, so <Lane>8</Lane>
  • ${BASH_REMATCH[1]} will give the part matched by the 1st group, thus only 8
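
To apply this to more than one line, the match can be wrapped in a loop. The sketch below is only one way to do it: extract_lanes is a hypothetical helper name, and the input lines are made up. Storing the pattern in a variable and using it unquoted on the right-hand side of =~ avoids having to backslash-escape < and > inside [[ ]].

```shell
#!/bin/bash
# Hypothetical helper: print the number inside <Lane>...</Lane>
# for every line read from standard input.
extract_lanes() {
    local re='<Lane>([[:digit:]]+)</Lane>'
    local line
    while IFS= read -r line; do
        # $re must stay unquoted so it is treated as a regex, not a literal.
        if [[ $line =~ $re ]]; then
            printf '%s\n' "${BASH_REMATCH[1]}"
        fi
    done
}

lanes=$(printf '%s\n' 'test <Lane>8</Lane>' 'no lane here' '<Lane>12</Lane>' | extract_lanes)
printf '%s\n' "$lanes"   # prints 8 and 12, one per line
```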

Solution 5:

If you want only what is in the parentheses, you need something that supports capturing sub-matches (named or numbered capturing groups). I don't think grep or egrep can do this, but perl and sed can. For example, with perl:

If a file called foo contains the following line:

/adsdds      /

And you do:

perl -nle 'print $1 if /\/(\w).+\//' foo

The letter a is returned. That might not be what you want, though. If you tell us what you are trying to match, you might get better help. $1 is whatever was captured in the first set of parentheses, $2 would be the second set, and so on.