r collapse timeline separated by 31 days by ID
This question is similar to a question here r collapse by year by ID
However I like to collapse timelines by ID and State , only if they the gap between their timelines is 31 days apart. If the gap is more than 31 days then they are not collapsed, they start with a new row. For example if this is my dataset
ID From To State
1 2004-04-05 2005-02-05 MD
1 2005-03-05 2005-03-05 MD
1 2005-04-05 2005-10-05 DC
1 2006-03-05 2006-10-05 DC
1 2006-11-05 2007-03-05 DC
1 2007-04-05 2007-06-05 MD
1 2008-03-05 2008-11-05 MD
1 2008-12-05 2010-08-05 MD
1 2010-09-05 2012-11-05 MD
2 2003-05-05 2004-08-05 OR
2 2004-09-05 2009-03-05 OR
2 2010-06-05 2010-08-05 AZ
2 2013-06-05 2015-06-05 AZ
The final dataset after collapsing will looks like this
ID From To State
1 2004-04-05 2005-03-05 MD
1 2005-04-05 2005-10-05 DC
1 2006-04-05 2007-03-05 DC
1 2007-04-05 2007-06-05 MD
1 2008-03-05 2012-11-05 MD
2 2003-05-05 2009-03-05 OR
2 2010-06-05 2010-08-05 AZ
2 2013-06-05 2015-06-05 AZ
Any suggestions regarding this is much appreciated.
TestCase 2 :
ID From To State
1 2003-09-05 2003-11-05 MD
1 2004-09-05 2007-05-05 TX
1 2007-06-05 2007-07-05 DC
1 2007-08-05 2009-07-05 DC
1 2011-11-05 2014-03-05 MD
1 2014-05-05 2017-06-05 MD
Expected results
ID From To State
1 2003-09-05 2003-11-05 MD
1 2004-09-05 2007-05-05 TX
1 2007-06-05 2009-07-05 DC
1 2011-11-05 2017-06-05 MD
Subtract current From
date from the previous To
date and create a new grouping column and select first
From
value and last
To
value within each group.
library(dplyr)
df %>%
mutate(across(c(From, To), as.Date)) %>%
group_by(ID, State,
group = cumsum(From - dplyr::lag(To, default = as.Date('1970-01-01')) > 31)) %>%
summarise(From = first(From),
To = last(To), .groups = 'drop') %>%
select(-group) %>%
arrange(ID, From)
# ID State From To
# <int> <chr> <date> <date>
#1 1 MD 2004-04-05 2005-03-05
#2 1 DC 2005-04-05 2005-10-05
#3 1 DC 2006-03-05 2007-03-05
#4 1 MD 2007-04-05 2007-06-05
#5 1 MD 2008-03-05 2012-11-05
#6 2 OR 2003-05-05 2009-03-05
#7 2 AZ 2010-06-05 2010-08-05
#8 2 AZ 2013-06-05 2015-06-05