Print the second or nth match of a sed range search between two patterns

I would like to print the nth match of a sed search which is based on two patterns, as shown below:

sed -n '/start here/,/end here/p'  'testfile.txt' 

Let's say that testfile.txt contains the text below:

start here
0000000
0000000
end here
start here
123
1234
12345

123456
end here
start here
00000000
end here
00000000

00000000

and that I do not want to print the zeros between the two patterns.

The command above prints every block between the patterns, so its output is:

start here
0000000
0000000
end here
start here
123
1234
12345

123456
end here
start here
00000000
end here

While my desired output is:

start here
123
1234
12345

123456
end here

Note that the lines must be printed exactly as they appear in testfile.txt, not concatenated.


Solution 1:

I would just switch to another tool. Perl, for example:

perl -ne '$k++ if /Pattern1/; if(/Pattern1/ .. /Pattern2/){print if $k==3}' file

That will print the 3rd match. Change the $k==3 to whatever value you want. The logic is:

  • $k++ if /Pattern1/ : increment the value of the variable $k by one if this line matches Pattern1.
  • if(/Pattern1/ .. /Pattern2/){print if $k==3} : if this line is within the range of /Pattern1/ to /Pattern2/, print it but only if $k is 3. Change this value to whichever match you want.

You could wrap this in a little shell function to be able to get the Nth match more easily:

getNth(){
  pat1="$1"
  pat2="$2"
  n="$3"
  file="$4"

  perl -ne '$k++ if /'"$pat1"'/;if(/'"$pat1"'/ .. /'"$pat2"'/){print if $k=='"$n"'}' "$file"

}

You could then run it like this:

getNth Pattern1 Pattern2 3 'huge file.txt' 

Using your example data:

$ perl -lne '$k++ if /start here/;if(/start here/ .. /end here/){print if $k==2}' testfile.txt
start here
123
1234
12345

123456
end here

Or:

$ getNth 'start here' 'end here' 2 testfile.txt
start here
123
1234
12345

123456
end here

Just for fun, here's another perl approach:

$ perl -lne '($k++,$l++) if /start here/; print if $l && $k==2; $l=0 if /end here/' testfile.txt 
start here
123
1234
12345

123456
end here

Or, if you like golfing (thanks @simlev):

perl -ne 'print if /^start here$/&&++$k==2../^end here$/' testfile.txt 

Solution 2:

I'd solve this with Perl, as @terdon sensibly suggests. Or with AWK:

awk '/start here/&&++k==2,/end here/' testfile.txt
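
The AWK one-liner is a range pattern: the range opens on the line where /start here/ matches and the counter k reaches 2, and it closes on the next /end here/. A sketch of the same idea with the match number passed in as an awk variable; the function name awkNth and the variable n are assumptions made for illustration:

```shell
# Hypothetical helper: same range pattern as the one-liner, with the
# match number taken from an awk variable instead of being hard-coded.
awkNth() {
  # $1 = which match, $2 = file
  # No action block is given, so awk prints every line inside the range.
  awk -v n="$1" '/start here/ && ++k == n, /end here/' "$2"
}
```

For the example data, awkNth 2 testfile.txt should print the same block as the one-liner above.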

If I had to employ sed alone (as the OP states in a comment) I'd come up with something much more convoluted, less readable and less customizable:

sed -n '/start here/{:A n; /end here/b B; b A}; :B n; /start here/{p; :C n; p; /end here/q; b C}; b B' testfile.txt
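
To make the sed script easier to follow, here is a sketch of the same commands laid out one per line with comments, wrapped in a function whose name (sedSecond) is made up for illustration. Tracing it by hand, it appears to assume that the first block starts on the first line of the file, since that is the only line the first /start here/ test ever sees:

```shell
# Hypothetical wrapper: the same sed commands as the one-liner, one per line.
# $1 = file to read. The second block is hard-coded, as in the original.
sedSecond() {
  sed -n '
# First block: read lines without printing until its end line.
/start here/{
:A
n
/end here/b B
b A
}
# Between blocks: keep reading until the next start line.
:B
n
/start here/{
# Second block: print every line, then quit at its end line.
p
:C
n
p
/end here/q
b C
}
b B
' "$1"
}
```

Picking a different block would mean adding another skip loop per block, which is why the Perl and AWK versions are easier to customize.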