The place to BI: SQL: How to select and eliminate duplicate records

lunedì 27 luglio 2015

SQL: How to select and eliminate duplicate records

This article will explain how to select (and then possibly delete) duplicate records in a SQL Server table.

It may happen that in a table there are duplicate records. Find them to delete them is always something boring.

To do this we can use line numbering function ROW_NUMBER () of MS SQL Server.

For example, this query

SELECT ROW_NUMBER() OVER(PARTITION BY VAT_NUMBER ORDER BY CUSTOMER_CODE) AS [RowNumber]
, VAT_NUMBER
, CUSTOMER_CODE
FROM CUSTOMERS

will list all customers, numbering them progressively when have the same Vat ( … PARTITION BY VAT_NUMBER … ).

To highlight duplicates, simply add a WHERE condition in this way:

SELECT * FROM (
SELECT ROW_NUMBER() OVER(PARTITION BY VAT_NUMBER ORDER BY CUSTOMER_CODE) AS [RowNumber]
, VAT_NUMBER
, CUSTOMER_CODE
FROM CUSTOMERS
) AS T1
WHERE RowNumber>1

Likewise, we can use the same trick to delete all duplicates:

DELETE T1 FROM (
SELECT ROW_NUMBER() OVER(PARTITION BY VAT_NUMBER ORDER BY CUSTOMER_CODE) AS [RowNumber]
, VAT_NUMBER
, CUSTOMER_CODE
FROM CUSTOMERS
) AS T1
WHERE T1.RowNumber >1

The place to BI

Pagine

lunedì 27 luglio 2015

SQL: How to select and eliminate duplicate records

Nessun commento:

Posta un commento