SSIS: flagging ALL the Data Quality issues in each row with Conditional Split
Assuming I understand your problem, I would structure this as a series of columns added to the data flow via Derived Column transformation.
Assume I have inbound like this
SELECT col1, col2, col3, col4;
My business rules
- col1 cannot contain nulls DQ_001
- col2 must be greater than 5 DQ_002
- col3 must be less than 3 DQ_003
- col4 has no rules
From my source, I would add a Derived Column Component
New Column named Pass_DQ_001
as a boolean with an expression !isnull([col1])
New Column named Pass_DQ_002
as a boolean with an expression [col2] > 5
New Column named Pass_DQ_003
as a boolean with an expression [col3] < 3
etc
At this point, your data row could look something like
NULL, 4, 4, No Rules, False, False, False
ABC, 7, 2, Still No Rules, True, True, True
...
If you have more than 3 to 4 data quality conditions, I'd add a final Derived Column component into the mix
New column IsValid
as yet another boolean with an expression like Pass_DQ_001 && Pass_DQ_002 && Pass_DQ_003
etc
The penalty for adding additional columns is trivial compared to trying to debug complex expressions in a dataflow so don't do it - especially for bit columns.
At this point, you can put a data viewer in there and verify that yes, all my logic is correct. If it's wrong, you can zip in and figure out why DQ_036 isn't flagging correctly.
Otherwise, you're ready to then connect the data flow to a Conditional Split. Use our final column IsValid
and things that match that go out the Output 1 path and the default/unmatched rows head to your "needs attention/failed validation" destination.