Select group of rows that match all items in a list
Assume I have two tables:
cars
– list of cars
carname | modelnumber | ...
passedtest
– contains every test that a car passed:
id | carname | testtype | date | ...
1 | carA | A | 2000 |
2 | carB | C | 2000 |
3 | carC | D | 2001 |
4 | carA | C | 2002 |
Now, how can I select a car from the passedtest
table that passed all tests (A, B, C, D)?
I tried the IN
statement but it also matches cars that pass even one test. I am looking for a statement to match all values in a list across all rows.
Solution 1:
How about this?
SELECT carname
FROM PassedTest
GROUP BY carname
HAVING COUNT(DISTINCT testtype) = 4
You can also use it as an inner statement for taking info from the cars
table:
SELECT *
FROM cars
WHERE carname IN (
SELECT carname
FROM PassedTest
GROUP BY carname
HAVING COUNT(DISTINCT testtype) = 4
)
Solution 2:
This type of problem is called Relational Division
.
SELECT a.*
FROM Cars a
INNER JOIN
(
SELECT CarName
FROM PassedTest
WHERE testType IN ('A', 'B', 'C', 'D')
GROUP BY CarName
HAVING COUNT(*) = 4
) b ON a.CarName = b.CarName
if a UNIQUE
constraint was not enforce on TestType
for every CarName
on table PassedTest
a DISTINCT
keyword is required on COUNT()
so it will only count unique values.
SELECT a.*
FROM Cars a
INNER JOIN
(
SELECT CarName
FROM PassedTest
WHERE testType IN ('A', 'B', 'C', 'D')
GROUP BY CarName
HAVING COUNT(DISTINCT TestType) = 4
) b ON a.CarName = b.CarName
- SQL of Relational Division
but if you are only interested on the CARNAME
then you don't need to join the tables. Querying on table PassedTest
will suit your needs.
SELECT CarName
FROM PassedTest
WHERE testType IN ('A', 'B', 'C', 'D')
GROUP BY CarName
HAVING COUNT(*) = 4
Solution 3:
You want to perform relational division, an operation that is not implemented in SQL. Here is an example where we have a product-supplier table and a required-products table:
CREATE TABLE product_supplier (
product_id int NOT NULL,
supplier_id int NOT NULL,
UNIQUE (product_id, supplier_id)
);
INSERT INTO product_supplier (product_id, supplier_id) VALUES
(1, 1),
(2, 1),
(3, 1),
(1, 2),
(2, 2),
(3, 2),
(4, 2),
(2, 3),
(3, 3),
(4, 3);
CREATE TABLE reqd (
product_id int NOT NULL,
UNIQUE (product_id)
);
INSERT INTO reqd (product_id) VALUES
(1),
(2),
(3);
... and we want to find all suppliers that supply ALL required products and perhaps others. The result in the above example would be supplier 1 and 2.
The most straight forward solution is this:
SELECT product_supplier.supplier_id
FROM product_supplier
LEFT JOIN reqd ON product_supplier.product_id = reqd.product_id
GROUP BY product_supplier.supplier_id
HAVING COUNT(reqd.product_id) = (SELECT COUNT(*) FROM reqd);
+-------------+
| supplier_id |
+-------------+
| 1 |
| 2 |
+-------------+
And if we want to find all suppliers that supply ALL required products and no others (exact division/no remainder) then add one more condition to the above:
SELECT product_supplier.supplier_id
FROM product_supplier
LEFT JOIN reqd ON product_supplier.product_id = reqd.product_id
GROUP BY product_supplier.supplier_id
HAVING COUNT(reqd.product_id) = (SELECT COUNT(*) FROM reqd)
AND COUNT(product_supplier.product_id) = (SELECT COUNT(*) FROM reqd);
+-------------+
| supplier_id |
+-------------+
| 1 |
+-------------+
An alternate solution is to rephrase the problem: select suppliers where a required product does not exist that does not exist in the products supplied by the supplier. Hmmm:
SELECT DISTINCT supplier_id
FROM product_supplier AS ps1
WHERE NOT EXISTS (
SELECT *
FROM reqd
WHERE NOT EXISTS (
SELECT *
FROM product_supplier AS ps2
WHERE ps1.supplier_id = ps2.supplier_id AND ps2.product_id = reqd.product_id
)
);
+-------------+
| supplier_id |
+-------------+
| 1 |
| 2 |
+-------------+