Mysql Duplicate Rows ( Duplicate detected using 2 columns )
How to remove duplicated in this setup?
id A B
----------------
1 apple 2
2 orange 1
3 apple 2
4 apple 1
In here I want to remove (apple,2) which occurs twice. The id numbers are unique. I would use DISTINCT keyword if it were not. Can I some how make a key out of columns A and B and then use the DISTINCT keyword on that to get what I need ? Many thanks for your replies.
delete from myTable
where id not in
(select min(id)
from myTable
group by A, B)
i.e. the select in brackets returns the first id for each grouping of A and B; deleting all ids that are not in this set will remove all occurences of an A-plus-B combination that are "subsequent" to its first occurrence.
EDIT: this syntax seems to be problematic: see bug report:
http://bugs.mysql.com/bug.php?id=5037
A possible workaround is to do this:
delete from myTable
where id not in
(
select minid from
(select min(id) as minid from myTable group by A, B) as newtable
)
Yet another (from http://labs.creativecommons.org/2010/01/12/removing-duplicate-rows-in-mysql/). Add a unique index then delete it:
ALTER IGNORE TABLE mytable ADD UNIQUE INDEX tmpindex (A,B);
ALTER TABLE mytable DROP INDEX tmpindex;
The IGNORE
keyword is a mysql extension that makes it drop rows that violate the UNIQUE
keyword instead of just failing.
DELETE FROM fruit_table FT1
WHERE EXISTS
(
SELECT * FROM fruit_table FT2
WHERE FT2.fruit_name_column = FT1.fruit_name_column
AND FT2.fruit_integer_column = FT1.fruit_integer_column
AND FT2.id <> FT1.id
)
This assumes you don't care which of the duplicate records is removed.