Modify Group By to achieve correct results

I have large data set with historical data of the INACTIVE Clients and their billing (services) information. I need to group data by ClientId, Service and Year AND produce aggregated fields, such as TotalCharges, TotalProvidedServices and DaysStayedWithUs (Issue with this column).

I was able successfully Group data by ClientId and Service, however grouping by Year producing the wrong result and I know why.

The reason is because Year column produced from ServiceDate column, that originally was used for creating FirstServiceDate and LastServiceDate => to be able produce DaysStayedWithUs (LastServiceDate - FirstServiceDate will give us Amount of Days the Client was with Us (DaysStayedWithUs)). And when I Group by Year, query finds FirstServiceDate, LastServiceDate and calculates DaysStayedWithUs based on it....therefore I have to find another way to achieve result I need (showed below).

Client Table:

 ClientId

1234567890

Billing Table:

ClientId       Service      ServiceDate   ChargesTotal

1234567890     Cleaning       12/4/2018       190.17
1234567890     Cleaning       1/22/2019        97.8
1234567890     Cleaning       1/29/2019        97.8
1234567890     Cleaning       2/5/2019         97.8
1234567890     Cleaning       2/12/2019        97.8
1234567890     Cleaning       3/5/2019         97.8
1234567890     Cleaning       2/19/2019        97.8
1234567890     Cleaning       3/12/2019        97.8
1234567890     Cleaning       3/19/2019        97.8
1234567890     Cleaning       3/26/2019        97.8

My code:

SELECT GroupedTable.ClientId, 
        GroupedTable.Service, 
        ---GroupedTable.Year,
        GroupedTable.FirstServiceDate, 
        GroupedTable.LastServiceDate, 
        GroupedTable.TotalCharges, 
        GroupedTable.TotalProvidedServices, 
        GroupedTable.DaysStayedWithUs

FROM (SELECT MainTable.ClientId, 
        MainTable.Service,
        SUM(MainTable.Charges) AS TotalCharges,      
        COUNT(*) AS TotalProvidedServices, 
        DATEDIFF(DAY, MIN(MainTable.ServiceDate), MAX(MainTable.ServiceDate)) as DaysStayedWithUs,
        ---MainTable.Year,
        MIN(MainTable.ServiceDate) AS FirstServiceDate,
        MAX(MainTable.ServiceDate) AS LastServiceDate

        FROM (SELECT c.ClientId, 
                b.Service,
                b.ServiceDate,
                ---YEAR(b.ServiceDate) AS Year, 
                b.Charges

            FROM Client as c

            LEFT JOIN Billing as b
                ON (c.ClientId = b.ClientId)

            WHERE b.ClientId = 1234567890

            ) as MainTable

---GROUP BY MainTable.ClientId, MainTable.ServiceIdentifier, MainTable.Year) AS GroupedTable
GROUP BY MainTable.ClientId, MainTable.Service) AS GroupedTable

Code above produces correct results: (Year deselected)

enter image description here

If I include Year in the Group By, output is:

enter image description here

P.S. I marked cells in Yellow that have issues (technically FirstServiceDate and LastServiceDate are need to be corrected and then DaysStayedWithUs will be adjusted by itself)

Result I need to accomplish:

enter image description here

Hope it is possible to achieve. Thank You.


Solution 1:

Use the window functions MIN OVER and MAX OVER in order to see whether a year is the first/last year for a client and service.

SELECT 
  c.clientid, 
  b.service,
  YEAR(b.servicedate) as year, 
  SUM(b.charges) AS total_charges,
  COUNT(*) AS total_provided_services
  CASE WHEN YEAR(b.servicedate) = MIN(YEAR(b.servicedate)) OVER (PARTITION BY c.clientid, b.service)
    THEN MIN(b.servicedate) ELSE DATEFROMPARTS(YEAR(b.servicedate), 1, 1)
  END AS first_service_date, 
  CASE WHEN YEAR(b.servicedate) = MAX(YEAR(b.servicedate)) OVER (PARTITION BY c.clientid, b.service)
    THEN MAX(b.servicedate) ELSE DATEFROMPARTS(YEAR(b.servicedate), 12, 31)
  END AS last_service_date,
  DATEDIFF(DAY, 
    CASE WHEN YEAR(b.servicedate) = MIN(YEAR(b.servicedate)) OVER (PARTITION BY c.clientid, b.service)
      THEN MIN(b.servicedate) ELSE DATEFROMPARTS(YEAR(b.servicedate), 1, 1) END, 
    CASE WHEN YEAR(b.servicedate) = MAX(YEAR(b.servicedate)) OVER (PARTITION BY c.clientid, b.service)
      THEN MAX(b.servicedate) ELSE DATEFROMPARTS(YEAR(b.servicedate), 12, 31) END 
  ) AS days_stayed_with_us
FROM client AS c
LEFT JOIN billing AS b ON b.clientid = c.clientid
WHERE c.clientid = 1234567890
GROUP BY c.clientid, b.service, YEAR(b.timebillingservicedate)
ORDER BY c.clientid, b.service, YEAR(b.timebillingservicedate);