Calculate rank based on Consecutive Dates (Return to 1 if there is a gap days between dates)

I am trying to get the rank column of this table.

Key	Date1	Name1	Location1
20	2021-12-30	Ann	Site A
23	2021-12-31	Ann	Site A
26	2022-01-03	Ann	Site A
28	2022-01-04	Ann	Site A
29	2022-01-05	Ann	Site A
32	2022-01-06	Ann	Site A
62	2022-01-08	Ann	Site A
63	2022-01-10	Ann	Site A
64	2022-01-11	Ann	Site A
65	2022-01-12	Ann	Site A
66	2022-01-13	Ann	Site A

Desired Output:

Key	Date1	Name1	Location1	Rank1
20	2021-12-30	Ann	Site A	1
23	2021-12-31	Ann	Site A	2
26	2022-01-03	Ann	Site A	1
28	2022-01-04	Ann	Site A	2
29	2022-01-05	Ann	Site A	3
32	2022-01-06	Ann	Site A	4
62	2022-01-08	Ann	Site A	1
63	2022-01-10	Ann	Site A	1
64	2022-01-11	Ann	Site A	2
65	2022-01-12	Ann	Site A	3
66	2022-01-13	Ann	Site A	4

I tried to use the script that I got from another post, but I still can't get my desired output of Rank Column

select
    t.Date1,
    t.Name1,
    t.Location1,
    row_number() over (partition by Name1, Location1, grp order by KEY1) as Rank1
from
(
select 
t.*,
sum(case when gap > 1 then 1 else 0 end) over (partition by Name1, Location1, order by KEY1) as grp
from
    (select
     t.*, 
     isnull(datediff(day, Date1, lag(Date1) over (partition by Name1, Location1, order by KEY1)), 1) as gap
     from Table1 t
    ) t
) t;

Thank you.

For me I find it easiest to process in this order:

highlight all cases where previous row is more than a day earlier
for every row where the gap is more than a day, output a 1, instead of a zero
from there, add up all the preceding values to create a group for each gap - since every gap will have a 1, each new gap will make the cumulative sum increase by 1, but every subsequent 0 will not, so it will keep all the consecutive days in an "island"
finally, use partition by to apply a new row number sequence to each group.

Resulting query is:

;WITH FindTheGaps AS 
(
  -- highlight all cases where previous row is more than a day earlier
  SELECT *, gap = CASE WHEN Date1 > 
    DATEADD(DAY, 1, LAG(Date1,1) OVER 
    (ORDER BY Date1))
    THEN 1 ELSE 0 END FROM dbo.Table1
),
MarkTheGaps AS
(
  -- add up all the preceding values to create a group for each gap
  SELECT *, grp = SUM(gap) OVER
    (ORDER BY Date1 
      ROWS UNBOUNDED PRECEDING) FROM FindTheGaps
)
SELECT Key1, Date1, Name1, Location1, 
  Rank1 = ROW_NUMBER() OVER 
    (PARTITION BY grp ORDER BY Date1)
  FROM MarkTheGaps ORDER BY Name1, Location1, Date1;

Output:

Key1	Date1	Name1	Location1	Rank1
20	2021-12-30	Ann	Site A	1
23	2021-12-31	Ann	Site A	2
26	2022-01-03	Ann	Site A	1
28	2022-01-04	Ann	Site A	2
29	2022-01-05	Ann	Site A	3
32	2022-01-06	Ann	Site A	4
62	2022-01-08	Ann	Site A	1
63	2022-01-10	Ann	Site A	1
64	2022-01-11	Ann	Site A	2
65	2022-01-12	Ann	Site A	3
66	2022-01-13	Ann	Site A	4

Example db<>fiddle

Your sample data had all the same Name1 and Location1 values, but if you need the ranks to start over both when there are gaps in the date and when the name or location changes, the logic is not really any different, you can just add those to all the OVER() clauses, e.g.:

;WITH FindTheGaps AS 
(
  -- highlight all cases where previous row is more than a day earlier
  SELECT *, gap = CASE WHEN Date1 > 
    DATEADD(DAY, 1, LAG(Date1,1) OVER 
    (PARTITION BY Name1, Location1 ORDER BY Date1))
    THEN 1 ELSE 0 END FROM dbo.Table1
),
MarkTheGaps AS
(
  -- add up all the preceding values to create a group for each gap
  SELECT *, grp = SUM(gap) OVER
    (PARTITION BY Name1, Location1 ORDER BY Date1 
      ROWS UNBOUNDED PRECEDING) FROM FindTheGaps
)
SELECT Key1, Date1, Name1, Location1, 
  Rank1 = ROW_NUMBER() OVER 
    (PARTITION BY Name1, Location1, grp ORDER BY Date1)
  FROM MarkTheGaps ORDER BY Name1, Location1, Date1;

With this source data it gives the same output.

This is a gaps and islands problem with a twist that we have to come up with a pseudo group for each continuous run of dates. Consider the following approach, which uses the difference in row numbers methods:

WITH cte AS (
    SELECT *, CASE WHEN DATEDIFF(day,
                                 LAG(Date1) OVER (PARTITION BY Location1
                                                  ORDER BY Date1),
                                 Date1) > 1
                   THEN 1 ELSE 0 END AS Marker
    FROM yourTable
),
cte2 AS (
    SELECT *, SUM(Marker) OVER (PARTITION BY Location1 ORDER BY Date1) AS grp
    FROM cte
)

SELECT [Key], Date1, Name1, Location1,
       ROW_NUMBER() OVER (PARTITION BY Location1, grp ORDER BY Date1) AS Rank1
FROM cte2
ORDER BY Location1, Date1;

Demo

The strategy above is to generate a value of 1 in a column called grp whenever we detect that the preceding date is not continuous with the current date. We can sum over the column then to generate pseudo groups, which can then be used with ROW_NUMBER to generate the rank values you want.

Calculate rank based on Consecutive Dates (Return to 1 if there is a gap days between dates)

Demo

Related

Recent Posts