How to select duplicate rows in a table?
Assume DUMMY_TABLE exists and KEY_COLUMN exists
DUMMY_TABLE has COLUMN_1 and COLUMN_2
SELECT KEY_COLUMN, COLUMN_1, COLUMN_2 from DUMMY_TABLE
GROUP BY KEY_COLUMN, COLUMN_1, COLUMN_2
HAVING COUNT(*) > 1
The important point to note is HAVING because unlike WHERE HAVING filters on aggregation functions.
By executing the above example if any rows are returned then we have duplicate rows. It is simple to find out where the table leak has taken place and then what needs to be done.
Our query will return the list of duplicate rows.
How SQL query works as follow:
GROUP BY clause groups the rows into groups by values KEY_COLUMN, COLUMN_1 and COLUMN_2 columns.
The COUNT(*) returns the number of occurrence for each group (KEY_COLUMN, COLUMN_1, COLUMN_2)
Final HAVING clause only retains DUPLICATE GROUPS. A GROUP having more than one COUNT.
In some databases, ROW_NUMBER() function can be used to find duplicate rows based on KEY_COLUMN, COLUMN_1 and COLUMN_2.
In reality, a UNIQUE constraint is used for a uniqueness for each row with one or many columns of a table. Normally constraint is broken if someone drops it for cleanup or for maintenance. When this happens then we need to find duplicate rows and provide manual delete function, stored procedure or nested SQL query.
Please see the next article for how to delete duplicate rows.
Please visit Business Integration Software to see various products using databases especially Online Exam Software.
Assessment Management Software
Multiple Choice Question Software