How to extract a value from a string using regex and a shell?

I am in a shell and I have this string: 12 BBQ ,45 rofl, 89 lol

Using the regexp: \d+ (?=rofl), I want 45 as a result.

Is it correct to use regex to extract data from a string? The best I have managed is to highlight the value in some online regex editors. Most of the time they remove the value from my string instead.

I am investigating expr, but all I get is syntax errors.

How can I manage to extract 45 in a shell script?


Solution 1:

You can do this with GNU grep's perl mode:

echo "12 BBQ ,45 rofl, 89 lol" | grep -P '\d+ (?=rofl)' -o
echo "12 BBQ ,45 rofl, 89 lol" | grep --perl-regexp '\d+ (?=rofl)' --only-matching

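-P (--perl-regexp) enables Perl-compatible regular expressions; -o (--only-matching) prints only the matched text. Note that the space before the lookahead is part of the match, so the output is "45 " with a trailing space.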
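
To use the result in a script you will usually want it in a variable, without surrounding whitespace. A minimal sketch, assuming GNU grep built with PCRE support; the variable names string and value are only illustrative. Moving the space inside the lookahead keeps it out of the match:

```shell
string='12 BBQ ,45 rofl, 89 lol'

# The lookahead now contains the space, so only the digits are matched.
value=$(printf '%s\n' "$string" | grep -oP '\d+(?= rofl)')

printf '[%s]\n' "$value"   # prints [45]
```

If grep reports that -P is not supported, this grep was built without PCRE and one of the sed-based approaches is the safer choice.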

Solution 2:

Yes, regex can certainly be used to extract part of a string. Unfortunately, different flavours of *nix and different tools use slightly different regex variants.

This sed command should work on most flavours (tested on OS X and Red Hat):

echo '12 BBQ ,45 rofl, 89 lol' | sed  's/^.*,\([0-9][0-9]*\).*$/\1/g'
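
To illustrate the point about regex variants, here is the same substitution written twice: once in basic regular expressions (sed's default, with escaped group parentheses and no + operator) and once in extended regular expressions. This is a sketch that assumes your sed accepts -E, which current GNU and BSD versions do; the variable names are illustrative.

```shell
string='12 BBQ ,45 rofl, 89 lol'

# Basic regex (default): groups are \( \), "one or more" is spelled [0-9][0-9]*
bre=$(printf '%s\n' "$string" | sed 's/^.*,\([0-9][0-9]*\).*$/\1/')

# Extended regex (-E): groups are ( ), and + is available
ere=$(printf '%s\n' "$string" | sed -E 's/^.*,([0-9]+).*$/\1/')

printf '%s %s\n' "$bre" "$ere"   # prints 45 45
```

Both forms pick the digits after the last comma that is immediately followed by a digit; neither looks at the word rofl.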

Solution 3:

It seems that you are asking multiple things. To answer them:

  • Yes, it is OK to extract data from a string using regular expressions; that's what they're there for.
  • You say you get errors: which ones, and which shell or tool are you using?
  • You can extract the numbers by catching them in capturing parentheses:

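    .*?(\d+) rofl.*
    

    and using $1 to get the string out (.* stands for "the rest before and after on the same line"; the leading one is made lazy with ? so that it does not swallow the first digits of the number)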

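With sed as an example, the idea becomes the following, which replaces every matching line of a file with only the number. Note that sed has no \d and no lazy quantifier, needs -E for unescaped parentheses, and writes the group as \1 rather than $1, so a [^0-9] before the group stops the greedy .* from eating the number's leading digits:

sed -E 's/.*[^0-9]([0-9]+) rofl.*/\1/' inputFileName > outputFileName

or:

echo "12 BBQ ,45 rofl, 89 lol" | sed -E 's/.*[^0-9]([0-9]+) rofl.*/\1/'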
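
Since the question mentions expr: its match operator (:) can do the same extraction. A minimal sketch, keeping in mind that expr uses basic regular expressions, anchors the pattern at the start of the string, and prints the \( \) group when there is one. The variable names are illustrative.

```shell
string='12 BBQ ,45 rofl, 89 lol'

# .*[^0-9] skips everything up to the last non-digit before the number;
# the \( \) group is what expr prints.
value=$(expr "$string" : '.*[^0-9]\([0-9][0-9]*\) rofl')

echo "$value"   # prints 45
```

The spaces around the colon are required, and the pattern must be quoted so the shell does not interpret the backslashes and asterisks; forgetting either is a common source of expr syntax errors. Also note that expr returns a non-zero exit status when the result is empty or 0.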

Solution 4:

You can use the shell itself (bash, for example) with parameter expansion:

$ string="12 BBQ ,45 rofl, 89 lol"
$ echo ${string% rofl*}
12 BBQ ,45
$ string=${string% rofl*}
$ echo ${string##*,}
45
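
The two expansions above rely on a comma sitting directly before the number. A possible generalisation, sketched as a small function (number_before is a made-up name, not a standard command), strips everything up to the last non-digit instead. It uses only POSIX parameter expansion, so it should also work in sh:

```shell
# number_before STRING KEYWORD
# Prints the digits that directly precede " KEYWORD"; returns 1 if none.
# Note: plain sh has no local variables, so before and value are global.
number_before() {
    before=${1%" $2"*}                 # drop " KEYWORD" and what follows
    [ "$before" != "$1" ] || return 1  # keyword not found
    value=${before##*[!0-9]}           # drop everything up to the last non-digit
    [ -n "$value" ] || return 1        # no digits right before the keyword
    printf '%s\n' "$value"
}

number_before '12 BBQ ,45 rofl, 89 lol' rofl   # prints 45
```

No external process is started, which matters mainly when the extraction runs in a tight loop.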