Free Newsletters:
DatabaseDaily  
Database Journal
Search Database Journal:
 
MS SQL Oracle DB2 Access MySQL PostgreSQL Sybase PHP SQL Etc SQL Scripts & Samples Links Database Forum

» Database Journal Home
» Database Articles
» Database Tutorials
MS SQL
Oracle
DB2
MS Access
MySQL
» RESOURCES
Database Tools
SQL Scripts & Samples
Links
» Database Forum
» DBA Jobs
» Sitemap

News Via RSS Feed


follow us on Twitter





Brocade Doubles Down on 16 Gbps Fibre Channel

Microsoft Wants iOS Apps to Run on WP7

Avaya Debuts New Virtual Services Switch
Database Journal |DBA Support |SQLCourse |SQLCourse2







Technical Specialist – Pre-sales (MA)
Next Step Systems
US-MA-Littleton

Justtechjobs.com Post A Job | Post A Resume

Featured Database Articles

MySQL

May 6, 2003

Deleting Duplicate Rows in a MySQL Database - Page 2

By Ian Gilfillan

Deleting duplicate records when multiple duplicates exist

Of course, not all of you are going to be so lucky as to only have one duplicate, and manually removing them one by one is hardly an answer. First add some more duplicates to the table, and then we will see what we can do:

INSERT INTO bad_table(id,name) VALUES
(1,'Things Fall Apart'),
(2,'The Famished Road'),
(3,'Thirteen cents'),
(3,'Thirteen cents');

Some of you may have read the article on Database Journal about removing duplicates, specifically for SQL Server databases (if you've stumbled across this article as a SQL Server user, you can find it here. Unfortunately none of the techniques mentioned in that article work with MySQL. You cannot use cursors, correlated subqueries or derived tables. But you can use temporary tables, although the way they work in MySQL is different to that which is described in that article. So, here's how you do it - first create a temporary table with the same structure as your existing table (MySQL 4.1 has a great feature - the LIKE keyword that allows you to create a new table based upon the structure of an existing table, but since it's only an alpha release most of you won't be running that yet).

CREATE TEMPORARY TABLE 
bad_temp(id INT,name VARCHAR(20))
TYPE=HEAP;

Note that a temporary table only exists for as long as the connection. Therefore, if you are using a web application such as PHPMyAdmin to test these SQL statements, you'll need to bundle the above statement with the next few, to ensure the temporary table is available all of the time (each web page in a browser-based tool such as PHPMyAdmin uses its own connection). Also, see the cautionary after the example for some things you need to be aware of when using this technique on a production server.

INSERT INTO bad_temp(id,name) SELECT DISTINCT id,name FROM bad_table;

[[Authors note: A reader has pointed out that I went out about this in a rather long-winded way. You can just as easily have used a single statement:

CREATE TEMPORARY TABLE bad_temp AS SELECT DISTINCT * FROM bad_table
Thanks to P-A Fredriksson for pointing this out!]]

Now the temporary table contains a set of data with no duplicates. The DISTINCT keyword takes care of that. Now, delete all of the data from the original (and read the cautionary below first when trying this on a production server!), and populate it from the temporary table, as follows:

DELETE FROM bad_table;

INSERT INTO bad_table(id,name) SELECT id,name FROM bad_temp;

Tools:
Add databasejournal.com to your favorites
Add databasejournal.com to your browser search box
IE 7 | Firefox 2.0 | Firefox 1.5.x
Receive news via our XML/RSS feed

MySQL Archives

Comment and Contribute

 


(Maximum characters: 1200). You have characters left.

 

 



Latest Forum Threads
MySQL Forum
Topic By Replies Updated
Attendance report Using Mysql pravingate07 0 February 7th, 06:14 AM
Navicat -- import tdetz 0 February 4th, 09:06 AM
inner joins and where nikj12 1 December 18th, 06:16 PM
Advice about software for a total newbie jvocat 2 December 8th, 03:37 PM