Duplicate entries in database table

Certified Associate Developer

Hi,

I have a situation where there is a need to identify the duplicate entries in a database table and delete the duplicate rows. How can I do this?

  Discussion posts and replies are publicly visible

Parents
  • 0
    Certified Senior Developer

    DELETE 
        FROM
        employee
    WHERE
        id IN(
        SELECT
            id
        FROM
            (
            SELECT
                id,
                RANK() OVER(
                PARTITION BY firstname,
                middlename,
                lastname
            ORDER BY
                id
            ) my_rank
        FROM
            employee) AS c
            WHERE
                c.my_rank > 1
        );

    you can use this MySQL code to delete the duplicate row in my employee table I have 4 columns id(PrimaryKey),firstname,middlename,lastname. Above mention query will delete all the duplicate rows but will keep one entry of each unique row.

    Note this query can partition on the basis of one column or multiple columns or all the columns depending on your requirement change the table name and column names used in the partition. if you want to remove all the duplicates then no need to apply where clause c.my_rank>1

Reply
  • 0
    Certified Senior Developer

    DELETE 
        FROM
        employee
    WHERE
        id IN(
        SELECT
            id
        FROM
            (
            SELECT
                id,
                RANK() OVER(
                PARTITION BY firstname,
                middlename,
                lastname
            ORDER BY
                id
            ) my_rank
        FROM
            employee) AS c
            WHERE
                c.my_rank > 1
        );

    you can use this MySQL code to delete the duplicate row in my employee table I have 4 columns id(PrimaryKey),firstname,middlename,lastname. Above mention query will delete all the duplicate rows but will keep one entry of each unique row.

    Note this query can partition on the basis of one column or multiple columns or all the columns depending on your requirement change the table name and column names used in the partition. if you want to remove all the duplicates then no need to apply where clause c.my_rank>1

Children