SQL query that gives distinct results that match multiple columns

Sorry, I couldn't provide a better title for my problem as I am quite new to SQL. I am looking for a SQL query string that solves the below problem.

Let's assume the following table:

DOCUMENT_ID |     TAG
----------------------------
   1        |   tag1
   1        |   tag2
   1        |   tag3
   2        |   tag2
   3        |   tag1
   3        |   tag2
   4        |   tag1
   5        |   tag3

Now I want to select all distinct document id's that contain one or more tags (but those must provide all specified tags). For example: Select all document_id's with tag1 and tag2 would return 1 and 3 (but not 4 for example as it doesn't have tag2).

What would be the best way to do that?

Regards, Kai


Solution 1:

SELECT document_id
FROM table
WHERE tag = 'tag1' OR tag = 'tag2'
GROUP BY document_id
HAVING COUNT(DISTINCT tag) = 2

Edit:

Updated for lack of constraints...

Solution 2:

This assumes DocumentID and Tag are the Primary Key.

Edit: Changed HAVING clause to count DISTINCT tags. That way it doesn't matter what the primary key is.

Test Data

-- Populate Test Data
CREATE TABLE #table (
  DocumentID varchar(8) NOT NULL, 
  Tag varchar(8) NOT NULL
)

INSERT INTO #table VALUES ('1','tag1')
INSERT INTO #table VALUES ('1','tag2')
INSERT INTO #table VALUES ('1','tag3')
INSERT INTO #table VALUES ('2','tag2')
INSERT INTO #table VALUES ('3','tag1')
INSERT INTO #table VALUES ('3','tag2')
INSERT INTO #table VALUES ('4','tag1')
INSERT INTO #table VALUES ('5','tag3')

INSERT INTO #table VALUES ('3','tag2')  -- Edit: test duplicate tags

Query

-- Return Results
SELECT DocumentID FROM #table
WHERE Tag IN ('tag1','tag2')
GROUP BY DocumentID
HAVING COUNT(DISTINCT Tag) = 2

Results

DocumentID
----------
1
3

Solution 3:

select DOCUMENT_ID
      TAG in ("tag1", "tag2", ... "tagN")
   group by DOCUMENT_ID
   having count(*) > N and 

Adjust N and the tag list as needed.