SQL query that gives distinct results that match multiple columns
Sorry, I couldn't provide a better title for my problem as I am quite new to SQL. I am looking for a SQL query string that solves the below problem.
Let's assume the following table:
DOCUMENT_ID | TAG ---------------------------- 1 | tag1 1 | tag2 1 | tag3 2 | tag2 3 | tag1 3 | tag2 4 | tag1 5 | tag3
Now I want to select all distinct document id's that contain one or more tags (but those must provide all specified tags). For example: Select all document_id's with tag1 and tag2 would return 1 and 3 (but not 4 for example as it doesn't have tag2).
What would be the best way to do that?
Regards, Kai
Solution 1:
SELECT document_id
FROM table
WHERE tag = 'tag1' OR tag = 'tag2'
GROUP BY document_id
HAVING COUNT(DISTINCT tag) = 2
Edit:
Updated for lack of constraints...
Solution 2:
This assumes DocumentID and Tag are the Primary Key.
Edit: Changed HAVING clause to count DISTINCT tags. That way it doesn't matter what the primary key is.
Test Data
-- Populate Test Data
CREATE TABLE #table (
DocumentID varchar(8) NOT NULL,
Tag varchar(8) NOT NULL
)
INSERT INTO #table VALUES ('1','tag1')
INSERT INTO #table VALUES ('1','tag2')
INSERT INTO #table VALUES ('1','tag3')
INSERT INTO #table VALUES ('2','tag2')
INSERT INTO #table VALUES ('3','tag1')
INSERT INTO #table VALUES ('3','tag2')
INSERT INTO #table VALUES ('4','tag1')
INSERT INTO #table VALUES ('5','tag3')
INSERT INTO #table VALUES ('3','tag2') -- Edit: test duplicate tags
Query
-- Return Results
SELECT DocumentID FROM #table
WHERE Tag IN ('tag1','tag2')
GROUP BY DocumentID
HAVING COUNT(DISTINCT Tag) = 2
Results
DocumentID
----------
1
3
Solution 3:
select DOCUMENT_ID
TAG in ("tag1", "tag2", ... "tagN")
group by DOCUMENT_ID
having count(*) > N and
Adjust N and the tag list as needed.