Create counter of consecutive runs of a certain value
I have data where consecutive runs of zero are separated by runs of non-zero values. I want to create a counter for the runs of zero in the column 'SOG'.
For the first sequence of 0 in SOG, set the counter in column Stops to 1. For the second run of zeros, set 'Stops' to 2, and so on.
SOG Stops
--- -----
4 0
4 0
0 1
0 1
0 1
3 0
4 0
5 0
0 2
0 2
1 0
2 0
0 3
0 3
0 3
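For reproducibility, the table above can be rebuilt as follows. This is only a sketch: the data frame name df is an assumption chosen to match the answers, and expected is a hypothetical helper vector holding the desired Stops column for checking results.

```r
# Rebuild the example from the table above; 'df' is an assumed name.
df <- data.frame(SOG = c(4, 4, 0, 0, 0, 3, 4, 5, 0, 0, 1, 2, 0, 0, 0))

# Desired result, copied from the Stops column of the table
# ('expected' is a hypothetical name used only for comparison).
expected <- c(0, 0, 1, 1, 1, 0, 0, 0, 2, 2, 0, 0, 3, 3, 3)
```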
Solution 1:
SOG <- c(4,4,0,0,0,3,4,5,0,0,1,2,0,0,0)
#run length encoding:
tmp <- rle(SOG)
#turn values into logicals
tmp$values <- tmp$values == 0
#cumulative sum of TRUE values
tmp$values[tmp$values] <- cumsum(tmp$values[tmp$values])
#invert the run length encoding
inverse.rle(tmp)
#[1] 0 0 1 1 1 0 0 0 2 2 0 0 3 3 3
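To make the intermediate steps easier to see, here is a sketch of the same rle() idea wrapped in a function. The name count_zero_runs is hypothetical, and the code assumes the input has no missing values.

```r
# Hypothetical helper: number the runs of zeros in x, 0 elsewhere.
# Assumes x contains no NA values.
count_zero_runs <- function(x) {
  runs <- rle(x)                             # run lengths and the value of each run
  is_zero <- runs$values == 0                # TRUE for runs made of zeros
  counter <- integer(length(is_zero))        # start with 0 for every run
  counter[is_zero] <- seq_len(sum(is_zero))  # 1, 2, 3, ... for the zero runs
  rep(counter, times = runs$lengths)         # expand back to one value per element
}

SOG <- c(4, 4, 0, 0, 0, 3, 4, 5, 0, 0, 1, 2, 0, 0, 0)
rle(SOG)$lengths
#[1] 2 3 1 1 1 2 1 1 3
rle(SOG)$values
#[1] 4 0 3 4 5 0 1 2 0
count_zero_runs(SOG)
#[1] 0 0 1 1 1 0 0 0 2 2 0 0 3 3 3
```

Because each run is handled as a unit, a run of zeros at the very start of the vector is counted like any other.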
Solution 2:
Try:
df$stops <- with(df, cumsum(c(0, diff(!SOG)) > 0) * !SOG)
df$stops
# [1] 0 0 1 1 1 0 0 0 2 2 0 0 3 3 3
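One caveat, as far as I can tell: the expression above counts a run only when it is preceded by a non-zero value, so a run of zeros at the very start of SOG would get 0. A possible variant, sketched here on a made-up input, prepends FALSE before differencing so that a leading run is also seen as a start. The names original and variant are only for illustration.

```r
SOG <- c(0, 0, 3, 0)  # hypothetical input that begins with a run of zeros

# Expression from this answer: the leading run is missed and gets 0.
original <- cumsum(c(0, diff(!SOG)) > 0) * !SOG
original
#[1] 0 0 0 1

# Variant: prepend FALSE so the first FALSE-to-TRUE change is detected.
variant <- cumsum(diff(c(FALSE, !SOG)) > 0) * !SOG
variant
#[1] 1 1 0 2
```

On the data from the question, which starts with non-zero values, both forms give the same result.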
Solution 3:
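Using dplyr (here df holds the 14-element SOG printed in the EDIT below, which starts with a single 4, so the output has 14 values):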
library(dplyr)
df <- df %>% mutate(Stops = ifelse(SOG == 0, yes = cumsum(c(0, diff(!SOG) > 0)), no = 0))
df$Stops
#[1] 0 1 1 1 0 0 0 2 2 0 0 3 3 3
EDIT: As an aside for those of us who are still beginners, many of the answers to this question make use of logicals (i.e. TRUE, FALSE). Placing ! before a numeric variable like SOG coerces it to logical and negates it, so the result is TRUE where the value is 0 and FALSE otherwise.
SOG
#[1] 4 0 0 0 3 4 5 0 0 1 2 0 0 0
!SOG
#[1] FALSE TRUE TRUE TRUE FALSE FALSE FALSE TRUE TRUE FALSE FALSE
#[12] TRUE TRUE TRUE
diff() takes the difference between each value and the one before it. Note that the result has one fewer element than SOG, since the first element has no predecessor with which to compute a difference. When it comes to logicals, diff(!SOG) produces 1 for TRUE - FALSE, -1 for FALSE - TRUE, and 0 otherwise.
diff(SOG)
#[1] -4 0 0 3 1 1 -5 0 1 1 -2 0 0
diff(!SOG)
#[1] 1 0 0 -1 0 0 1 0 -1 0 1 0 0
So cumsum(diff(!SOG) > 0) counts only the TRUE - FALSE changes, i.e. the points where a run of zeros starts:
cumsum(diff(!SOG) > 0)
#[1] 1 1 1 1 1 1 2 2 2 2 3 3 3
But since the vector of differences is one element shorter than SOG, we prepend a 0 to restore the original length:
cumsum(c(0, diff(!SOG) > 0)) #Or cumsum( c(0, diff(!SOG)) > 0 )
#[1] 0 1 1 1 1 1 1 2 2 2 2 3 3 3
Then either "multiply" that vector by !SOG as in @akrun's answer, or use ifelse(). Where an element of SOG is 0, we take the corresponding element from cumsum(c(0, diff(!SOG) > 0)); where it isn't 0, we assign 0.
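Putting the last step together, here is a short sketch showing both ways of zeroing out the non-zero positions on the 14-element SOG used in this answer. The names starts, by_product and by_ifelse are only for illustration.

```r
SOG <- c(4, 0, 0, 0, 3, 4, 5, 0, 0, 1, 2, 0, 0, 0)

# Number of zero-runs started so far, padded to the length of SOG.
starts <- cumsum(c(0, diff(!SOG) > 0))

by_product <- starts * !SOG                # "multiply" version: FALSE acts as 0
by_ifelse  <- ifelse(SOG == 0, starts, 0)  # ifelse() version

by_product
#[1] 0 1 1 1 0 0 0 2 2 0 0 3 3 3
identical(as.numeric(by_product), as.numeric(by_ifelse))
#[1] TRUE
```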