How to extract text between two words (first match only)

Solution 1:

1st solution: With your shown samples, please try the following GNU awk code. It uses awk's match function with the regex from station\s+\S+\s+to point to find the value requested by the OP, then removes from station\s+ and \s+to point from the matched text and prints what remains. The exit makes sure only the first match is printed.

awk '
match($0,/from station\s+\S+\s+to point/){
  val=substr($0,RSTART,RLENGTH)
  gsub(/from station\s+|\s+to point/,"",val)
  print val
  exit
}
' Input_file
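
The \s and \S shorthands are GNU awk extensions. As a sketch for other awk implementations, the same logic can be written with bracket expressions. The file name sample_input.txt and its contents are made up for illustration, since the real Input_file is not shown here.

```shell
# Hypothetical sample input (an assumption, not the data from the question).
printf '%s\n' \
  'log: from station d-435-435 to point A' \
  'log: from station x-111-222 to point B' > sample_input.txt

# Same steps as the gawk version, with [ \t] in place of \s and [^ \t] in place of \S.
first_id=$(awk '
match($0, /from station[ \t]+[^ \t]+[ \t]+to point/) {
  val = substr($0, RSTART, RLENGTH)                    # the whole matched text
  gsub(/from station[ \t]+|[ \t]+to point/, "", val)   # strip both delimiters
  print val
  exit                                                 # stop after the first match
}' sample_input.txt)

echo "$first_id"
```

On this sample the command prints d-435-435 and never reads the second line.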


2nd solution: Using GNU grep, please try the following. The -o option prints only the matched portion, -P enables PCRE regex, and -m1 stops after the first matching line. The pattern matches the string from station followed by space(s); \K then makes grep forget everything matched before it (since we don't need that part in the output). Next it matches \S+ (non-space characters), and a positive lookahead checks that space(s) and the string to point follow, without including them in the output.

grep -oP -m1 'from station\s+\K\S+(?=\s+to point)' Input_file
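
One thing to keep in mind: -m1 limits the number of matching lines, not the number of matches, so with -o a single line that contains the pattern twice still prints two results. A minimal sketch of one way to guard against that, assuming a made-up line with two occurrences, is to pipe through head:

```shell
# Hypothetical line with two occurrences of the pattern (an assumption for illustration).
line='from station d-435-435 to point A, from station x-111-222 to point B'

# grep -o prints every match on the line; head -n 1 keeps only the first one.
first_match=$(printf '%s\n' "$line" \
  | grep -oP 'from station\s+\K\S+(?=\s+to point)' \
  | head -n 1)

echo "$first_match"
```

If each line of the input holds at most one occurrence, the original -m1 form is enough.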

Solution 2:

If GNU sed is available, how about:

id=$(sed -nE '0,/from station.*to/ s/.*from station (.*) to.*/\1/p' input.txt)
  • The -n option suppresses automatic printing, so a line is printed only when the substitution succeeds (because of the p flag).
  • The address 0,/pattern/ is a range that ends at the first line matching the pattern, so the substitution is not attempted on any later line. The 0 start address is a GNU sed extension which lets the pattern match on the 1st line as well.
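
Because (.*) is greedy, the capture above runs up to the last " to" on the line. A slightly tighter sketch, assuming the value contains no spaces and is always followed by "to point", is shown below. The file sample_input.txt and its contents are hypothetical.

```shell
# Hypothetical sample input (an assumption, not the data from the question).
printf '%s\n' \
  'header line' \
  'from station d-435-435 to point A' \
  'from station x-111-222 to point B' > sample_input.txt

# The range 0,/from station.*to/ ends at the first matching line (GNU sed only),
# and [^ ]+ captures a single space-free token instead of a greedy .* run.
id=$(sed -nE '0,/from station.*to/ s/.*from station ([^ ]+) to point.*/\1/p' sample_input.txt)

echo "$id"
```

Here the header line is inside the range but does not match the substitution, so only d-435-435 is printed.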

Solution 3:

With awk you can test the fields before and after field $4 (where d-435-435 is), print that field for the first match only, and stop with exit right after the print statement:

awk '$2=="from" && $3=="station" && $5=="to" && $6=="point" {print $4; exit}' file
d-435-435

or using GNU awk for the 3rd arg to match():

awk 'match($0,/from station\s+(.*)\s+to point/,a){print a[1];exit}' file
d-435-435
  • The regexp contains a parenthesized group, so the array element a[1] contains the portion of the string between from station followed by space(s) (\s+) and space(s) (\s+) followed by to point.
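
The field-based command in this solution assumes that "from" is always field $2. If the words can appear at other positions, one possible variation is to scan the fields for the same five-word shape. This is only a sketch that works in any POSIX awk; the file sample_input.txt and its contents are invented for illustration.

```shell
# Hypothetical sample input where "from" is not in field 2 (an assumption).
printf '%s\n' \
  'ts=1 msg: from station d-435-435 to point A' \
  'ts=2 msg: from station x-111-222 to point B' > sample_input.txt

# Slide over the fields and look for: from station <value> to point
first_id=$(awk '{
  for (i = 1; i + 4 <= NF; i++)
    if ($i == "from" && $(i+1) == "station" && $(i+3) == "to" && $(i+4) == "point") {
      print $(i+2)   # the field between "station" and "to"
      exit           # stop after the first match
    }
}' sample_input.txt)

echo "$first_id"
```

Like the original command, this compares whole fields, so the value is returned only when it is a single whitespace-delimited token.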