Remove reversed accounting entries from table SQL Server [duplicate]
I have a query that will result in a customer bill being created on our SSRS 2008 R2 server. The SQL Server instance is also 2008 R2. The query is large and I don't want to post the entire thing for security reasons, etc.
What I need to do with the example data below, is to remove the two rows with 73.19 and -73.19 from the result set. So, if two rows have the same absolute value in the LineBalance column and their sum is 0 AND if they have the same value in the REF1 column, the should be removed from the result set. The line with REF1 = 14598 and a line balance of 281.47 should still be returned in the result set and the other two rows below with REF1 = 14598 should not be returned.
The point of this is to "hide" accounting errors and their correction from the customer. by "hide" I mean, not show it on the bill they get in the mail. What happened here is the customer was mistakenly billed 73.19 when they should have been billed 281.47. So, our AR dept. returned 73.19 to their account and charged them the correct amount of 281.47. As you can see they all have the same REF1 value.
I would add a field that would contain explicit flag telling you that a certain charge was a mistake/reversal of a mistake and then it is trivial to filter out such rows. Doing it on the fly could make your reports rather slow.
But, to solve the given problem as is we can do like this. The solution assumes that SysInvNum
is unique.
Create a table with sample data
DECLARE @T TABLE (SysInvNum int, REF1 int, LineBalance money);
INSERT INTO @T (SysInvNum, REF1, LineBalance) VALUES (3344299, 14602, 558.83);
INSERT INTO @T (SysInvNum, REF1, LineBalance) VALUES (3344298, 14598, 281.47);
INSERT INTO @T (SysInvNum, REF1, LineBalance) VALUES (3344297, 14602, -95.98);
INSERT INTO @T (SysInvNum, REF1, LineBalance) VALUES (3344296, 14598, -73.19);
INSERT INTO @T (SysInvNum, REF1, LineBalance) VALUES (3341758, 14598, 73.19);
INSERT INTO @T (SysInvNum, REF1, LineBalance) VALUES (11, 100, 50.00);
INSERT INTO @T (SysInvNum, REF1, LineBalance) VALUES (12, 100, -50.00);
INSERT INTO @T (SysInvNum, REF1, LineBalance) VALUES (13, 100, 50.00);
INSERT INTO @T (SysInvNum, REF1, LineBalance) VALUES (21, 200, -50.00);
INSERT INTO @T (SysInvNum, REF1, LineBalance) VALUES (22, 200, -50.00);
INSERT INTO @T (SysInvNum, REF1, LineBalance) VALUES (23, 200, 50.00);
INSERT INTO @T (SysInvNum, REF1, LineBalance) VALUES (31, 300, -50.00);
INSERT INTO @T (SysInvNum, REF1, LineBalance) VALUES (32, 300, 50.00);
INSERT INTO @T (SysInvNum, REF1, LineBalance) VALUES (33, 300, -50.00);
INSERT INTO @T (SysInvNum, REF1, LineBalance) VALUES (34, 300, 50.00);
INSERT INTO @T (SysInvNum, REF1, LineBalance) VALUES (41, 400, 50.00);
INSERT INTO @T (SysInvNum, REF1, LineBalance) VALUES (42, 400, -50.00);
INSERT INTO @T (SysInvNum, REF1, LineBalance) VALUES (43, 400, 50.00);
INSERT INTO @T (SysInvNum, REF1, LineBalance) VALUES (44, 400, -50.00);
INSERT INTO @T (SysInvNum, REF1, LineBalance) VALUES (45, 400, 50.00);
I've added few more cases that have multiple mistakes.
Number and count rows
SELECT
SysInvNum
, REF1
, LineBalance
, ROW_NUMBER() OVER(PARTITION BY REF1, LineBalance ORDER BY SysInvNum) AS rn
, COUNT(*) OVER(PARTITION BY REF1, ABS(LineBalance)) AS cc1
FROM @T AS TT
This is the result set:
SysInvNum REF1 LineBalance rn cc1
11 100 50.00 1 3
12 100 -50.00 1 3
13 100 50.00 2 3
21 200 -50.00 1 3
23 200 50.00 1 3
22 200 -50.00 2 3
31 300 -50.00 1 4
32 300 50.00 1 4
33 300 -50.00 2 4
34 300 50.00 2 4
41 400 50.00 1 5
42 400 -50.00 1 5
43 400 50.00 2 5
44 400 -50.00 2 5
45 400 50.00 3 5
3341758 14598 73.19 1 2
3344296 14598 -73.19 1 2
3344298 14598 281.47 1 1
3344297 14602 -95.98 1 1
3344299 14602 558.83 1 1
You can see that those rows that have mistakes have count > 1. Also, pairs of mistakes have same row numbers. So, we need to remove/hide those rows that have count > 1 and those that have two same row numbers.
Determine rows to remove
WITH
CTE_rn
AS
(
SELECT
SysInvNum
, REF1
, LineBalance
, ROW_NUMBER() OVER(PARTITION BY REF1, LineBalance ORDER BY SysInvNum) AS rn
, COUNT(*) OVER(PARTITION BY REF1, ABS(LineBalance)) AS cc1
FROM @T AS TT
)
, CTE_ToRemove
AS
(
SELECT
SysInvNum
, REF1
, LineBalance
, COUNT(*) OVER(PARTITION BY REF1, rn) AS cc2
FROM CTE_rn
WHERE CTE_rn.cc1 > 1
)
SELECT *
FROM CTE_ToRemove
WHERE CTE_ToRemove.cc2 = 2
This is another intermediate result:
SysInvNum REF1 LineBalance cc2
12 100 -50.00 2
11 100 50.00 2
21 200 -50.00 2
23 200 50.00 2
32 300 50.00 2
31 300 -50.00 2
33 300 -50.00 2
34 300 50.00 2
42 400 -50.00 2
41 400 50.00 2
43 400 50.00 2
44 400 -50.00 2
3344296 14598 -73.19 2
3341758 14598 73.19 2
Now, we just put all this together.
Final query
WITH
CTE_rn
AS
(
SELECT
SysInvNum
, REF1
, LineBalance
, ROW_NUMBER() OVER(PARTITION BY REF1, LineBalance ORDER BY SysInvNum) AS rn
, COUNT(*) OVER(PARTITION BY REF1, ABS(LineBalance)) AS cc1
FROM @T AS TT
)
, CTE_ToRemove
AS
(
SELECT
SysInvNum
, REF1
, LineBalance
, COUNT(*) OVER(PARTITION BY REF1, rn) AS cc2
FROM CTE_rn
WHERE CTE_rn.cc1 > 1
)
SELECT *
FROM @T AS TT
WHERE
TT.SysInvNum NOT IN
(
SELECT CTE_ToRemove.SysInvNum
FROM CTE_ToRemove
WHERE CTE_ToRemove.cc2 = 2
)
ORDER BY SysInvNum;
Result:
SysInvNum REF1 LineBalance
13 100 50.00
22 200 -50.00
45 400 50.00
3344297 14602 -95.98
3344298 14598 281.47
3344299 14602 558.83
Note, that final result doesn't have any rows with REF = 300, because there were two corrected mistakes, that balanced each other completely.
Most AR/billing systems treat "credit memos" (the negative amount) similar to cash, in which case the -73.19 would get applied to the 73.19 LineBalance
the same as if the customer had paid that amount, resulting in a $0 balance.
OPTION 1:
Do you handle cash receipts and applications in this sytem? If so, you may be able to pull data from those cash application tables to show the tie between SysInvNum 3344296 and 3341758.
OPTION 2:
I'm assuming that PayAdjust
column is used to reduce the balance after a customer has paid, and that LineBalance is a calculated column that is Charges + PayAdjust
.
Most of the time when this occurs, the AR department would be responsible for applying the credit memo to an open invoice, so that the PayAdjust
column would net $0 between the 2 rows, and this would cause the LineBalance
to also be $0 on each of the 2 rows. It may just be a training issue for the system that is being used.
This would cause the 3 rows in question to look like this, so you don't have an issue, you would just exclude the rows by adding where LineBalance <> 0
to your query since the AR department (which applied the credit to begin with and so knows the answer to this question) explicitly stated which LineBalance
the credit applies to:
Option 2 Preferred data structure:
SysInvNum REF1 Charges PayAdjust LineBalance
----------- ----------- --------------------- --------------------- ---------------------
3344298 14598 281.47 0.00 281.47
3344296 14598 -73.19 73.19 0.00
3341758 14598 73.19 -73.19 0.00
OPTION 3:
Without having this data from Option 1 or 2, you have make many assumptions and run the risk of inadvertently hiding the wrong rows.
That being said, here is a query that attempts to do what you are asking, but I would highly recommend checking with the AR dept to see if they can update "PayAdjust" for these records instead.
I added several test cases of scenarios that could cause issues, but this may not cover all the bases.
This query will only hide rows where one distinct matching negative value is found for a positive value, for the same REF1 and same DueDate. It also makes sure that the original Charge invoice ID is prior to the credit since it can be assumed that a credit would not occur prior to the actual charge (Test case 6 shows both rows still because the credit has an SysInvNum that occurred before the charge). If more than once match is found per REF1, DueDate, and LineBalance
, then it will not hide the corresponding charge and credit lines (test cases 2 & 4). Test case 3 sums in total to 0, but it still shows all 3 rows because the LineBalance values do not match exactly. These are all assumptions that I made to handle edge cases, so they can be adjusted as needed.
CREATE TABLE #SysInvTable (SysInvNum int not null primary key, REF1 int, Charges money, PayAdjust money, LineBalance as Charges + PayAdjust, DueDate date, REF2 int, Remark varchar(50), REM varchar(50));
INSERT INTO #SysInvTable(SysInvNum, REF1, Charges, PayAdjust, DueDate, Remark)
VALUES
--.....................................
--Your test case
(3344298, 14598, 281.47, 0, '2014-12-08','Your original test case. This one should stay.')
, (3344296, 14598, -73.19, 0, '2014-12-08',null)
, (3341758, 14598, 73.19, 0, '2014-12-08',null)
--.....................................
--Test case 2: How do you match these up?
, (2001, 2, 73.19, 0, '2015-01-06','Charge 2.1')
, (2002, 2, 73.19, 0, '2015-01-06','Charge 2.2')
, (2003, 2, 73.19, 0, '2015-01-06','Charge 2.3')
, (2004, 2, -73.19, 0, '2015-01-06','Credit for charge 2.3')
, (2005, 2, -73.19, 0, '2015-01-06','Credit for charge 2.1')
--.....................................
--Test case 3
, (3001, 3, 73.19, 0, '2015-01-06','Charge 3.1')
, (3002, 3, 73.19, 0, '2015-01-06','Charge 3.2')
, (3003, 3, -146.38, 0, '2015-01-06','Credit for charges 3.1 and 3.2')
--.....................................
--Test case 4: Do you hide 4001 or 4002?
, (4001, 4, 73.19, 0, '2015-01-06','Cable')
, (4002, 4, 73.19, 0, '2015-01-06','Internet')
, (4003, 4, -73.19, 0, '2015-01-06','Misc Credit')
--.....................................
--Test case 5: remove all lines except the first
, (5000, 5, 9.99, 0, '2015-01-06','Charge 5.0 (Should stay)')
, (5001, 5, 11.11, 0, '2015-01-06','Charge 5.1')
, (5002, 5, 22.22, 0, '2015-01-06','Charge 5.2')
, (5003, 5, 33.33, 0, '2015-01-06','Charge 5.3')
, (5004, 5, -11.11, 0, '2015-01-06','Credit for charge 5.1')
, (5005, 5, -33.33, 0, '2015-01-06','Credit for charge 5.3')
, (5006, 5, -22.22, 0, '2015-01-06','Credit for charge 5.2')
--.....................................
--Test case 6: credit occurs before charge, so keep both
, (6000, 6, -73.19, 0, '2015-01-06','Credit occurs before charge')
, (6001, 6, 73.19, 0, '2015-01-06','Charge 6.1')
;
SELECT i.*
FROM #SysInvTable i
WHERE i.SysInvNum not in
(
SELECT IngoreInvNum = case when c.N = 1 then max(t.SysInvNum) else min(t2.SysInvNum) end
FROM #SysInvTable t
INNER JOIN #SysInvTable t2
ON t.ref1 = t2.ref1
AND t.DueDate = t2.DueDate
CROSS APPLY (SELECT 1 AS N UNION ALL SELECT 2 as N) AS c --used to both both T and T2 SysInvNum's to exclude
WHERE 1=1
AND t.LineBalance > 0 AND t2.LineBalance < 0
AND t.SysInvNum < t2.SysInvNum --make sure the credit came in after the positive SysInvNum
AND t.LineBalance = t2.LineBalance * -1
GROUP BY t.REF1, t.DueDate, abs(t.LineBalance), c.n
HAVING Count(*) = 1
)
;
DROP TABLE #SysInvTable;