Magicsheet logo

Delete Duplicate Emails

Easy
87.6%
Updated 6/1/2025

Delete Duplicate Emails

What is this problem about?

The Delete Duplicate Emails interview question is a classic SQL data cleaning task. You are given a Person table with Id and Email. You need to delete all duplicate email entries, keeping only the one with the smallest Id. This Delete Duplicate Emails coding problem is a test of your ability to write DELETE statements with subqueries or self-joins.

Why is this asked in interviews?

Microsoft and Amazon use this to test basic database maintenance skills. It evaluates if you know how to use a table alias to compare a table with itself and whether you can write a WHERE clause that targets specific rows for removal based on an aggregate condition.

Algorithmic pattern used

This follows the Database interview pattern.

  1. Self-Join approach: DELETE p1 FROM Person p1, Person p2 WHERE p1.Email = p2.Email AND p1.Id > p2.Id. This deletes the row with the larger ID if a smaller ID with the same email exists.
  2. Subquery approach: Identify the MIN(Id) for each email and delete any row whose ID is not in that list.

Example explanation

Table:

  1. The query finds that ID 1 and ID 3 have the same email.
  2. Since 3 > 1, ID 3 is targeted for deletion.
  3. ID 2 is unique and stays. Final Table: IDs 1 and 2.

Common mistakes candidates make

  • SELECT instead of DELETE: Forgetting that the problem asks to modify the table, not just return a view.
  • Keeping the wrong ID: Keeping the largest ID instead of the smallest.
  • Subquery Limitations: In MySQL, you cannot delete from a table if you are selecting from it in a subquery without using an intermediate temporary table or a join.

Interview preparation tip

Always clarify which database dialect is being used (MySQL, PostgreSQL, SQL Server). The syntax for DELETE with joins varies slightly, and knowing these nuances shows professional experience.

Similar Questions