Trying to only return distinct ID entries from a table where all values of a certain column must be matched

So the Situation is that I have 3 Tables, 2 of which are "helper" tables while the other one is the main table I'm trying to get distinct IDs out of:

Main Table dbo.recipes has columns ID, Name and some others such as:

ID  NAME
5   Veggie Cassola
6   Mozzarella Penne
7   Wiener Schnitzel with Fries
8   Grilled Salmon with Rice
9   Greek style Salad

The helpers are dbo.stock:

ID_USER     ID_INGREDIENT
1           225
1           585
1           607
1           643
1           763
1           874
1           937
1           959
1           960
2           225
2           246
2           331
2           363
2           511
2           585

and dbo.content:

ID_INGREDIENT   ID_RECIPE
98              5
196             5
333             5
607             5
608             5
613             5
627             5
643             5
763             5
874             5
951             5
956             5
225             6
585             6
607             6

Basically the dbo.stock is the inventory of ingredients that the user has hence user id and ingredient id. The dbo.content is the ingredients needed to make a certain dish.

What I want to query is ONLY recipes that the user actually has the ingredients for, so that means that all the recipes which have ALL their ingredients matched (to a certain user) should be returned. The code that I have at the moment for my procedure is as follows:

SELECT * FROM [dbo].[recipe]
    WHERE [recipe].[id] NOT IN
    (SELECT DISTINCT [content].[id_recipe] FROM [dbo].[content]
        WHERE [content].[id_ingredient] NOT IN
            (SELECT [stock].[id_ingredient] FROM [dbo].[stock]
                WHERE [stock].[id_user] = @userID))

which works, but I doubt this is the best way to achieve this. Is there a better way to reach the same?

MS SQL Server Express 2019


Solution 1:

Basically, you want to find all the recipes where there isn't an ingredient in content that is not in stock. It's not the way you think about it in English, but it leads to this if you think about it that way in SQL:

DECLARE @userID int = 1;

SELECT ID, NAME
FROM dbo.recipe AS r
WHERE NOT EXISTS
( 
  SELECT id_ingredient FROM dbo.content WHERE id_recipe = r.ID
  EXCEPT
  SELECT id_ingredient FROM dbo.stock WHERE id_user = @userID
);

However, this query is more along the lines of yours, just without the expensive DISTINCTs found in both of the above plans (EXCEPT is sneaky like that), so is probably the best option:

DECLARE @userID int = 1;

SELECT ID, NAME
FROM dbo.recipe AS r
WHERE NOT EXISTS
( 
  SELECT 1 FROM dbo.content AS c
  WHERE id_recipe = r.ID AND NOT EXISTS 
  (
    SELECT 1 FROM dbo.stock 
    WHERE id_ingredient = c.id_ingredient
      AND id_user = @userID
  )
);
  • Example db<>fiddle

Solution 2:

This is a classic Relational Division With Remainder question.

@AaronBertrand has given you a couple good solutions. Here is another one that is often used.

DECLARE @userID int = 1;

SELECT
  r.Id,
  r.Name
FROM dbo.recipe AS r
JOIN dbo.content AS c ON c.id_recipe = r.ID
LEFT JOIN dbo.stock AS s ON s.id_ingredient = c.id_ingredient
    AND s.id_user = @userID
GROUP BY
  r.Id,
  r.Name
HAVING COUNT(*) = COUNT(s.id_ingredient);

This will join everything together (left-joining the stock), group by the recipe and return only those groupings that have the same number of rows as there are non-null stock rows. In other words, every content must match, and there must be at least one content.

There is a semantic difference: if you wanted also all recipes which have no `content, you can change it slightly.

DECLARE @userID int = 1;

SELECT
  r.Id,
  r.Name
FROM dbo.recipe AS r
LEFT JOIN dbo.content AS c ON c.id_recipe = r.ID
LEFT JOIN dbo.stock AS s ON s.id_ingredient = c.id_ingredient
    AND s.id_user = @userID
GROUP BY
  r.Id,
  r.Name
HAVING COUNT(c.id_recipe) = COUNT(s.id_ingredient);

db<>fiddle