rle-like function that catches "run" of adjacent integers
Solution 1:
1) Calculate values and then lengths based on values
s <- split(x, cumsum(c(0, diff(x) != 1)))
run.info <- list(lengths = unname(sapply(s, length)), values = unname(s))
Running it on x from the question gives this:
> str(run.info)
List of 2
$ lengths: int [1:5] 3 6 1 2 6
$ values :List of 5
..$ : num [1:3] 3 4 5
..$ : num [1:6] 10 11 12 13 14 15
..$ : num 17
..$ : num [1:2] 22 23
..$ : num [1:6] 35 36 37 38 39 40
2) Calculate lengths and then values based on lengths
Here is a second solution based on Gregor's length calculation:
lens <- rle(x - seq_along(x))$lengths
list(lengths = lens, values = unname(split(x, rep(seq_along(lens), lens))))
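As a rough illustration of why the subtraction works: within a run of consecutive integers, each value grows by exactly 1 per position, so x minus its index stays constant, and rle() then counts those constant stretches. The sketch below uses the x from the question; the name offsets is only for illustration.

```r
x <- c(3:5, 10:15, 17, 22, 23, 35:40)

# Subtract each element's position from its value. Adjacent elements share
# the same offset exactly when they differ by 1.
offsets <- x - seq_along(x)

# Ordinary run-length encoding of the offsets yields the consecutive-run lengths.
lens <- rle(offsets)$lengths
lens
```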
3) Calculate lengths and values independently of each other
This one seems inefficient, since it calculates lengths and values each from scratch, and it is somewhat overly complex, but it does get everything down to a single statement, so I thought I would add it as well. It's basically just a mix of solutions 1) and 2) above, with nothing really new relative to those two.
list(lengths = rle(x - seq_along(x))$lengths,
values = unname(split(x, cumsum(c(0, diff(x) != 1)))))
EDIT: Added second solution.
EDIT: Added third solution.
Solution 2:
How about
rle(x - 1:length(x))$lengths
# 3 6 1 2 6
The lengths are what you want. I'm blanking on an equally clever way to get the proper values, but with cumsum() and the original x they're very accessible.
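One possible way to do that, as a minimal sketch: the cumulative sum of the lengths gives the index of the last element of each run, and subtracting the length (plus one) gives the index of the first. The names ends and starts are made up for this example.

```r
x <- c(3:5, 10:15, 17, 22, 23, 35:40)

lens   <- rle(x - seq_along(x))$lengths
ends   <- cumsum(lens)        # position of the last element of each run
starts <- ends - lens + 1L    # position of the first element of each run

# First and last value of every run, side by side with its length.
runs <- data.frame(length = lens, first = x[starts], last = x[ends])
runs
```

From starts and ends the full runs can also be recovered, for example with Map(function(s, e) x[s:e], starts, ends).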
Solution 3:
As you say, it is easy enough to write something similar to rle. Indeed, adjusting the code for rle by adding + 1 might give something like
rle_consec <- function(x)
{
    if (!is.vector(x) && !is.list(x))
        stop("'x' must be an atomic vector")
    n <- length(x)
    if (n == 0L)
        return(structure(list(lengths = integer(), values = x),
                         class = "rle_consec"))
    y <- x[-1L] != x[-n] + 1
    i <- c(which(y | is.na(y)), n)
    structure(list(lengths = diff(c(0L, i)), values = x[i]),
              class = "rle_consec")
}
and using your example
> x <- c(3:5, 10:15, 17, 22, 23, 35:40)
> rle_consec(x)
$lengths
[1] 3 6 1 2 6
$values
[1] 5 15 17 23 40
attr(,"class")
[1] "rle_consec"
which is what John expected.
You could adjust the code further to give the first of each consecutive subsequence rather than the last.
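A sketch of that adjustment, under the assumption that only the reported values should change: the end positions i are computed as before, and the start of each run is found by stepping back by the run length. The function name rle_consec_first is hypothetical.

```r
rle_consec_first <- function(x)
{
    if (!is.vector(x) && !is.list(x))
        stop("'x' must be an atomic vector")
    n <- length(x)
    if (n == 0L)
        return(structure(list(lengths = integer(), values = x),
                         class = "rle_consec"))
    y <- x[-1L] != x[-n] + 1
    i <- c(which(y | is.na(y)), n)   # positions where each run ends
    lens <- diff(c(0L, i))           # run lengths, as in rle_consec
    # Step back from each end position to the first element of its run.
    structure(list(lengths = lens, values = x[i - lens + 1L]),
              class = "rle_consec")
}

x <- c(3:5, 10:15, 17, 22, 23, 35:40)
res <- rle_consec_first(x)
res$values
```

With the example x, the lengths are unchanged and the values are the first element of each run (3, 10, 17, 22, 35) instead of the last.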