DISTINCT for only one column
Let's say I have the following query.
SELECT ID, Email, ProductName, ProductModel FROM Products
How can I modify it so that it returns no duplicate Emails?
In other words, when several rows contain the same email, I want the results to include only one of those rows (preferably the last one). Duplicates in other columns should be allowed.
Clauses like DISTINCT
and GROUP BY
appear to work on entire rows. So I'm not sure how to approach this.
Solution 1:
If you are using SQL Server 2005 or above use this:
SELECT *
FROM (
SELECT ID,
Email,
ProductName,
ProductModel,
ROW_NUMBER() OVER(PARTITION BY Email ORDER BY ID DESC) rn
FROM Products
) a
WHERE rn = 1
EDIT: Example using a where clause:
SELECT *
FROM (
SELECT ID,
Email,
ProductName,
ProductModel,
ROW_NUMBER() OVER(PARTITION BY Email ORDER BY ID DESC) rn
FROM Products
WHERE ProductModel = 2
AND ProductName LIKE 'CYBER%'
) a
WHERE rn = 1
Solution 2:
This assumes SQL Server 2005+ and your definition of "last" is the max PK for a given email
WITH CTE AS
(
SELECT ID,
Email,
ProductName,
ProductModel,
ROW_NUMBER() OVER (PARTITION BY Email ORDER BY ID DESC) AS RowNumber
FROM Products
)
SELECT ID,
Email,
ProductName,
ProductModel
FROM CTE
WHERE RowNumber = 1
Solution 3:
You can over that by using GROUP BY
like this:
SELECT ID, Email, ProductName, ProductModel
FROM Products
GROUP BY Email
Solution 4:
When you use DISTINCT
think of it as a distinct row, not column. It will return only rows where the columns do not match exactly the same.
SELECT DISTINCT ID, Email, ProductName, ProductModel
FROM Products
----------------------
1 | [email protected] | ProductName1 | ProductModel1
2 | [email protected] | ProductName1 | ProductModel1
The query would return both rows because the ID
column is different. I'm assuming that the ID
column is an IDENTITY
column that is incrementing, if you want to return the last then I recommend something like this:
SELECT DISTINCT TOP 1 ID, Email, ProductName, ProductModel
FROM Products
ORDER BY ID DESC
The TOP 1
will return only the first record, by ordering it by the ID
descending it will return the results with the last row first. This will give you the last record.