Calculate rank based on Consecutive Dates (Return to 1 if there is a gap days between dates)
I am trying to get the rank column of this table.
Key | Date1 | Name1 | Location1 |
---|---|---|---|
20 | 2021-12-30 | Ann | Site A |
23 | 2021-12-31 | Ann | Site A |
26 | 2022-01-03 | Ann | Site A |
28 | 2022-01-04 | Ann | Site A |
29 | 2022-01-05 | Ann | Site A |
32 | 2022-01-06 | Ann | Site A |
62 | 2022-01-08 | Ann | Site A |
63 | 2022-01-10 | Ann | Site A |
64 | 2022-01-11 | Ann | Site A |
65 | 2022-01-12 | Ann | Site A |
66 | 2022-01-13 | Ann | Site A |
Desired Output:
Key | Date1 | Name1 | Location1 | Rank1 |
---|---|---|---|---|
20 | 2021-12-30 | Ann | Site A | 1 |
23 | 2021-12-31 | Ann | Site A | 2 |
26 | 2022-01-03 | Ann | Site A | 1 |
28 | 2022-01-04 | Ann | Site A | 2 |
29 | 2022-01-05 | Ann | Site A | 3 |
32 | 2022-01-06 | Ann | Site A | 4 |
62 | 2022-01-08 | Ann | Site A | 1 |
63 | 2022-01-10 | Ann | Site A | 1 |
64 | 2022-01-11 | Ann | Site A | 2 |
65 | 2022-01-12 | Ann | Site A | 3 |
66 | 2022-01-13 | Ann | Site A | 4 |
I tried to use the script that I got from another post, but I still can't get my desired output of Rank Column
select
t.Date1,
t.Name1,
t.Location1,
row_number() over (partition by Name1, Location1, grp order by KEY1) as Rank1
from
(
select
t.*,
sum(case when gap > 1 then 1 else 0 end) over (partition by Name1, Location1, order by KEY1) as grp
from
(select
t.*,
isnull(datediff(day, Date1, lag(Date1) over (partition by Name1, Location1, order by KEY1)), 1) as gap
from Table1 t
) t
) t;
Thank you.
For me I find it easiest to process in this order:
- highlight all cases where previous row is more than a day earlier
- for every row where the gap is more than a day, output a 1, instead of a zero
- from there, add up all the preceding values to create a group for each gap - since every gap will have a 1, each new gap will make the cumulative sum increase by 1, but every subsequent 0 will not, so it will keep all the consecutive days in an "island"
- finally, use partition by to apply a new row number sequence to each group.
Resulting query is:
;WITH FindTheGaps AS
(
-- highlight all cases where previous row is more than a day earlier
SELECT *, gap = CASE WHEN Date1 >
DATEADD(DAY, 1, LAG(Date1,1) OVER
(ORDER BY Date1))
THEN 1 ELSE 0 END FROM dbo.Table1
),
MarkTheGaps AS
(
-- add up all the preceding values to create a group for each gap
SELECT *, grp = SUM(gap) OVER
(ORDER BY Date1
ROWS UNBOUNDED PRECEDING) FROM FindTheGaps
)
SELECT Key1, Date1, Name1, Location1,
Rank1 = ROW_NUMBER() OVER
(PARTITION BY grp ORDER BY Date1)
FROM MarkTheGaps ORDER BY Name1, Location1, Date1;
Output:
Key1 | Date1 | Name1 | Location1 | Rank1 |
---|---|---|---|---|
20 | 2021-12-30 | Ann | Site A | 1 |
23 | 2021-12-31 | Ann | Site A | 2 |
26 | 2022-01-03 | Ann | Site A | 1 |
28 | 2022-01-04 | Ann | Site A | 2 |
29 | 2022-01-05 | Ann | Site A | 3 |
32 | 2022-01-06 | Ann | Site A | 4 |
62 | 2022-01-08 | Ann | Site A | 1 |
63 | 2022-01-10 | Ann | Site A | 1 |
64 | 2022-01-11 | Ann | Site A | 2 |
65 | 2022-01-12 | Ann | Site A | 3 |
66 | 2022-01-13 | Ann | Site A | 4 |
- Example db<>fiddle
Your sample data had all the same Name1
and Location1
values, but if you need the ranks to start over both when there are gaps in the date and when the name or location changes, the logic is not really any different, you can just add those to all the OVER()
clauses, e.g.:
;WITH FindTheGaps AS
(
-- highlight all cases where previous row is more than a day earlier
SELECT *, gap = CASE WHEN Date1 >
DATEADD(DAY, 1, LAG(Date1,1) OVER
(PARTITION BY Name1, Location1 ORDER BY Date1))
THEN 1 ELSE 0 END FROM dbo.Table1
),
MarkTheGaps AS
(
-- add up all the preceding values to create a group for each gap
SELECT *, grp = SUM(gap) OVER
(PARTITION BY Name1, Location1 ORDER BY Date1
ROWS UNBOUNDED PRECEDING) FROM FindTheGaps
)
SELECT Key1, Date1, Name1, Location1,
Rank1 = ROW_NUMBER() OVER
(PARTITION BY Name1, Location1, grp ORDER BY Date1)
FROM MarkTheGaps ORDER BY Name1, Location1, Date1;
With this source data it gives the same output.
This is a gaps and islands problem with a twist that we have to come up with a pseudo group for each continuous run of dates. Consider the following approach, which uses the difference in row numbers methods:
WITH cte AS (
SELECT *, CASE WHEN DATEDIFF(day,
LAG(Date1) OVER (PARTITION BY Location1
ORDER BY Date1),
Date1) > 1
THEN 1 ELSE 0 END AS Marker
FROM yourTable
),
cte2 AS (
SELECT *, SUM(Marker) OVER (PARTITION BY Location1 ORDER BY Date1) AS grp
FROM cte
)
SELECT [Key], Date1, Name1, Location1,
ROW_NUMBER() OVER (PARTITION BY Location1, grp ORDER BY Date1) AS Rank1
FROM cte2
ORDER BY Location1, Date1;
Demo
The strategy above is to generate a value of 1 in a column called grp
whenever we detect that the preceding date is not continuous with the current date. We can sum over the column then to generate pseudo groups, which can then be used with ROW_NUMBER
to generate the rank values you want.