rle-like function that catches "run" of adjacent integers

Solution 1:

1) Calculate values and then lengths based on values

s <- split(x, cumsum(c(0, diff(x) != 1)))
run.info <- list(lengths = unname(sapply(s, length)), values = unname(s))

Running it using x from the question gives this:

> str(run.info)
List of 2
 $ lengths: int [1:5] 3 6 1 2 6
 $ values :List of 5
  ..$ : num [1:3] 3 4 5
  ..$ : num [1:6] 10 11 12 13 14 15
  ..$ : num 17
  ..$ : num [1:2] 22 23
  ..$ : num [1:6] 35 36 37 38 39 40
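
To see why the grouping works, it can help to look at the intermediate vectors. The sketch below assumes x is the example vector from the question (it is defined again here so the snippet runs on its own). The names breaks and group are illustrative only and are not part of the solution above.

```r
# Example vector from the question
x <- c(3:5, 10:15, 17, 22, 23, 35:40)

# TRUE wherever an element is not exactly 1 more than its predecessor,
# i.e. wherever a new run starts
breaks <- diff(x) != 1

# Prepending 0 and taking the cumulative sum gives every run its own id:
# 0 0 0 1 1 1 1 1 1 2 3 3 4 4 4 4 4 4
group <- cumsum(c(0, breaks))

# Splitting x by that id yields one vector per run
s <- split(x, group)
run.lengths <- unname(lengths(s))
```

Here lengths(s) is a shorter equivalent of sapply(s, length) for lists.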

2) Calculate lengths and then values based on lengths

Here is a second solution based on Gregor's length calculation:

lens <- rle(x - seq_along(x))$lengths 
list(lengths = lens, values = unname(split(x, rep(seq_along(lens), lens))))
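
The length calculation relies on x - seq_along(x) being constant within a run of consecutive integers and changing whenever a run breaks. A minimal sketch of the intermediate steps, again assuming the example x from the question (offset, r and run.id are names made up for this illustration):

```r
# Example vector from the question
x <- c(3:5, 10:15, 17, 22, 23, 35:40)

# Within a run, the value and the position both grow by 1,
# so their difference stays the same
offset <- x - seq_along(x)

# Ordinary rle on the offsets therefore finds the runs of consecutive integers
r <- rle(offset)

# Expand the run lengths into a per-element run id and split x by it
run.id <- rep(seq_along(r$lengths), r$lengths)
vals <- unname(split(x, run.id))
```

Two neighbouring runs can only share an offset if the second starts exactly 1 above the end of the first, in which case they would be a single run anyway.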

3) Calculate lengths and values independently of each other

This one seems inefficient, since it calculates lengths and values each from scratch, and it is somewhat overly complex. It does get everything down to a single statement, though, so I thought I would add it as well. It's basically just a mix of solutions 1) and 2) above, with nothing really new relative to those two.

list(lengths = rle(x - seq_along(x))$lengths,
           values = unname(split(x, cumsum(c(0, diff(x) != 1)))))

EDIT: Added second solution.

EDIT: Added third solution.  

Solution 2:

How about

rle(x - seq_along(x))$lengths
# 3 6 1 2 6

The lengths are what you want. I'm blanking on an equally clever way to get the proper values, but with cumsum() and the original x they're easy to recover.
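
One way to read the cumsum() hint, as a sketch rather than the only option: the cumulative sum of the lengths gives the index of the last element of each run, and the first element follows from that. The variable names ends and starts are illustrative, and x is assumed to be the example vector from the question.

```r
# Example vector from the question
x <- c(3:5, 10:15, 17, 22, 23, 35:40)

lens <- rle(x - seq_along(x))$lengths

# Index of the last element of each run
ends <- cumsum(lens)

# Index of the first element of each run
starts <- ends - lens + 1L

first.values <- x[starts]   # 3 10 17 22 35
last.values  <- x[ends]     # 5 15 17 23 40
```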

Solution 3:

As you say, it is easy enough to write something similar to rle. Indeed, adjusting the code for rle by adding + 1 might give something like

rle_consec <- function(x)
{
    if (!is.vector(x) && !is.list(x))
        stop("'x' must be an atomic vector")
    n <- length(x)
    if (n == 0L)
        return(structure(list(lengths = integer(), values = x),
                         class = "rle_consec"))
    y <- x[-1L] != x[-n] + 1
    i <- c(which(y | is.na(y)), n)
    structure(list(lengths = diff(c(0L, i)), values = x[i]),
              class = "rle_consec")
}

and using your example

> x <- c(3:5, 10:15, 17, 22, 23, 35:40)
> rle_consec(x)
$lengths
[1] 3 6 1 2 6

$values
[1]  5 15 17 23 40

attr(,"class")
[1] "rle_consec"

which is what John expected.

You could adjust the code further to give the first of each consecutive subsequence rather than the last.
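
A possible sketch of that adjustment follows. The function name rle_consec_first is made up for this example. The only change from rle_consec is that values is taken at the start of each run, which is one position after the end of the previous run.

```r
rle_consec_first <- function(x)
{
    if (!is.vector(x) && !is.list(x))
        stop("'x' must be an atomic vector")
    n <- length(x)
    if (n == 0L)
        return(structure(list(lengths = integer(), values = x),
                         class = "rle_consec"))
    y <- x[-1L] != x[-n] + 1
    # Positions where a run ends (the last position always ends a run)
    i <- c(which(y | is.na(y)), n)
    # Each run starts at position 1 or right after the previous run's end
    first <- c(1L, i[-length(i)] + 1L)
    structure(list(lengths = diff(c(0L, i)), values = x[first]),
              class = "rle_consec")
}

x <- c(3:5, 10:15, 17, 22, 23, 35:40)
r <- rle_consec_first(x)
r$lengths   # 3 6 1 2 6
r$values    # 3 10 17 22 35
```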