Select statement to find duplicates on certain fields
Can you help me with SQL statements to find duplicates on multiple fields?
For example, in pseudo code:
select count(field1,field2,field3)
from table
where the combination of field1, field2, field3 occurs multiple times
and from the above statement if there are multiple occurrences I would like to select every record except the first one.
To get the list of fields for which there are multiple records, you can use..
select field1,field2,field3, count(*)
from table_name
group by field1,field2,field3
having count(*) > 1
Check this link for more information on how to delete the rows.
http://support.microsoft.com/kb/139444
There should be a criterion for deciding how you define "first rows" before you use the approach in the link above. Based on that you'll need to use an order by clause and a sub query if needed. If you can post some sample data, it would really help.
You mention "the first one", so I assume that you have some kind of ordering on your data. Let's assume that your data is ordered by some field ID
.
This SQL should get you the duplicate entries except for the first one. It basically selects all rows for which another row with (a) the same fields and (b) a lower ID exists. Performance won't be great, but it might solve your problem.
SELECT A.ID, A.field1, A.field2, A.field3
FROM myTable A
WHERE EXISTS (SELECT B.ID
FROM myTable B
WHERE B.field1 = A.field1
AND B.field2 = A.field2
AND B.field3 = A.field3
AND B.ID < A.ID)