cumulative sum using R

I have a data file that look like this data The question: At what date is the total incidence (cumulative sum) between Venus and Mar is more than 2000?


I've created a simple example using an index instead of a Date column:

df <- data.frame(country = c(rep("Mar",10), rep("Venus",10)), 
                 incidence = runif(20,0,30),
                 index=seq(1,20,1))

library(dplyr)
df %>% 
  group_by(country) %>%
  mutate(cumInc = cumsum(incidence)) %>% 
  filter(cumInc > 100) %>% 
  filter(index==min(index))

country incidence index cumInc
  <fct>       <dbl> <dbl>  <dbl>
1 Mar          29.2    10   108.
2 Venus        22.5    16   110.

You can just change 100 to your threshold and change index to date to get the first Date for Venus and for Mar when the cumulative sum exceeds the given threshold. So e.g.:

df %>% 
      group_by(country) %>%
      mutate(cumInc = cumsum(incidence)) %>% 
      filter(cumInc > **Your Threshold**) %>% 
      filter(date==min(date))

If you want to obtain a data.frame later you can add simply %>% as.data.frame().

If you want to save you information just use something like:

result <- df %>% 
   group_by(...