How to remove duplicate entries from a mysql db?
Solution 1:
This command adds a unique key, and drops all rows that generate errors (due to the unique key). This removes duplicates.
ALTER IGNORE TABLE table ADD UNIQUE KEY idx1(title);
Edit: Note that this command may not work for InnoDB tables for some versions of MySQL. See this post for a workaround. (Thanks to "an anonymous user" for this information.)
Solution 2:
Create a new table with just the distinct rows of the original table. There may be other ways but I find this the cleanest.
CREATE TABLE tmp_table AS SELECT DISTINCT [....] FROM main_table
More specifically:
The faster way is to insert distinct rows into a temporary table. Using delete, it took me a few hours to remove duplicates from a table of 8 million rows. Using insert and distinct, it took just 13 minutes.
CREATE TABLE tempTableName LIKE tableName;
CREATE INDEX ix_all_id ON tableName(cellId,attributeId,entityRowId,value);
INSERT INTO tempTableName(cellId,attributeId,entityRowId,value) SELECT DISTINCT cellId,attributeId,entityRowId,value FROM tableName;
DROP TABLE tableName;
INSERT tableName SELECT * FROM tempTableName;
DROP TABLE tempTableName;