Database Journal
MS SQL Oracle DB2 Access MySQL PostgreSQL Sybase PHP SQL Etc SQL Scripts & Samples Links Database Forum

» Database Journal Home
» Database Articles
» Database Tutorials
MS SQL
Oracle
DB2
MS Access
MySQL
» RESOURCES
Database Tools
SQL Scripts & Samples
Links
» Database Forum
» Sitemap
Free Newsletters:
DatabaseDaily  
News Via RSS Feed


follow us on Twitter
Database Journal |DBA Support |SQLCourse |SQLCourse2
 

Featured Database Articles

MySQL

Posted May 6, 2003

Deleting Duplicate Rows in a MySQL Database - Page 2

By Ian Gilfillan

Deleting duplicate records when multiple duplicates exist

Of course, not all of you are going to be so lucky as to only have one duplicate, and manually removing them one by one is hardly an answer. First add some more duplicates to the table, and then we will see what we can do:

INSERT INTO bad_table(id,name) VALUES
(1,'Things Fall Apart'),
(2,'The Famished Road'),
(3,'Thirteen cents'),
(3,'Thirteen cents');

Some of you may have read the article on Database Journal about removing duplicates, specifically for SQL Server databases (if you've stumbled across this article as a SQL Server user, you can find it here. Unfortunately none of the techniques mentioned in that article work with MySQL. You cannot use cursors, correlated subqueries or derived tables. But you can use temporary tables, although the way they work in MySQL is different to that which is described in that article. So, here's how you do it - first create a temporary table with the same structure as your existing table (MySQL 4.1 has a great feature - the LIKE keyword that allows you to create a new table based upon the structure of an existing table, but since it's only an alpha release most of you won't be running that yet).

CREATE TEMPORARY TABLE 
bad_temp(id INT,name VARCHAR(20))
TYPE=HEAP;

Note that a temporary table only exists for as long as the connection. Therefore, if you are using a web application such as PHPMyAdmin to test these SQL statements, you'll need to bundle the above statement with the next few, to ensure the temporary table is available all of the time (each web page in a browser-based tool such as PHPMyAdmin uses its own connection). Also, see the cautionary after the example for some things you need to be aware of when using this technique on a production server.

INSERT INTO bad_temp(id,name) SELECT DISTINCT id,name FROM bad_table;

[[Authors note: A reader has pointed out that I went out about this in a rather long-winded way. You can just as easily have used a single statement:

CREATE TEMPORARY TABLE bad_temp AS SELECT DISTINCT * FROM bad_table
Thanks to P-A Fredriksson for pointing this out!]]

Now the temporary table contains a set of data with no duplicates. The DISTINCT keyword takes care of that. Now, delete all of the data from the original (and read the cautionary below first when trying this on a production server!), and populate it from the temporary table, as follows:

DELETE FROM bad_table;

INSERT INTO bad_table(id,name) SELECT id,name FROM bad_temp;


MySQL Archives

Comment and Contribute

 


(Maximum characters: 1200). You have characters left.

 

 




Latest Forum Threads
MySQL Forum
Topic By Replies Updated
MySQL in high availability and transction secure banking application klamor 1 August 28th, 10:24 AM
MySQL rollback UAL225 1 August 28th, 10:15 AM
Browsing a DB file that uses MySql finleytech 1 July 26th, 10:51 AM
php cookie won't pass variable to next page ITdevGirl 0 June 22nd, 12:13 PM