Count lines between "X"s
I want to count the lines between "X"s. This is just an example; I have to apply the code to a complex biological result. I would be thankful if you could suggest a command, preferably using awk, grep or sed, as I am familiar with those.
Example:
X
Y
Y
Y
X
Y
Y
Y
Y
X
Y
X
Desired Output:
3
4
1
With awk:
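$ awk '!/X/{count++} /X/{if (seen++) print count; count = 0}' input
3
4
1
Increment a count for every line not containing X; print and reset the count for lines containing X. The seen flag suppresses the print at the very first X, which has nothing before it and would otherwise produce an empty line at the top of the output.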
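Note that /X/ matches any line that contains an X anywhere, which may matter for real data where X can also appear inside other records. A minimal sketch of an exact-match variant, assuming the delimiter lines consist of exactly X and nothing else; the function name count_exact and the seen flag are illustrative, not part of the original command:

```shell
# count_exact: hypothetical helper; reads files given as arguments, or stdin.
count_exact() {
  awk '
    $0 == "X" {                # delimiter: the whole line is exactly X
      if (seen) print count    # skip the print at the very first delimiter
      seen = 1
      count = 0
      next
    }
    { count++ }                # every other line, including ones like "YX"
  ' "$@"
}

# The example from the question:
printf '%s\n' X Y Y Y X Y Y Y Y X Y X | count_exact
# 3
# 4
# 1
```

With this version, a line such as YX is counted as an ordinary line instead of being treated as a delimiter.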
$ awk '/X/ && prev{print NR-prev-1} /X/{prev=NR}' file
3
4
1
How it works:
Awk implicitly reads through input files line by line.
- /X/ && prev{print NR-prev-1}
  For any line that contains X, and if we have previously assigned a value to prev, print the number of the current line, NR, minus prev minus one.
- /X/{prev=NR}
  For any line that contains X, set the variable prev to the current line number, NR.
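To see the arithmetic at work, the same two rules can be made to print the line numbers of each pair of X lines next to the count. This is only a sketch for inspection; the function name gaps and the output format are assumptions chosen for illustration:

```shell
# gaps: hypothetical helper; reads files given as arguments, or stdin.
gaps() {
  awk '
    /X/ {
      # prev is unset (false) at the first X, so nothing is printed there
      if (prev) printf "%d-%d: %d\n", prev, NR, NR - prev - 1
      prev = NR                # remember where this X was seen
    }
  ' "$@"
}

# The example from the question has X on lines 1, 5, 10 and 12:
printf '%s\n' X Y Y Y X Y Y Y Y X Y X | gaps
# 1-5: 3
# 5-10: 4
# 10-12: 1
```

Because the first line of a file is numbered 1, prev is never 0 once it has been set, so testing it as a condition is safe.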