The Duplicate Emails interview question is a fundamental database query task. You are given a table named Person with columns Id and Email. Your goal is to write a SQL query that identifies all email addresses that appear more than once in the table. This is a common data validation task used to ensure data integrity or identify user accounts with multiple entries.
This question is frequently used by companies like Microsoft and Meta to test basic SQL knowledge. It evaluates a candidate's understanding of database interview pattern concepts like grouping, aggregation, and filtering. Specifically, it checks if you know how to use the GROUP BY and HAVING clauses together, which is essential for any backend or data engineering role.
This is a standard SQL Aggregation problem.
GROUP BY Email to group rows with the same email.COUNT() aggregate function to count the number of occurrences in each group.HAVING clause to filter for groups where the count is strictly greater than 1.
SELECT Email FROM Person GROUP BY Email HAVING COUNT(Email) > 1;Suppose the Person table looks like this:
| Id | |
|---|---|
| 1 | a@b.com |
| 2 | c@d.com |
| 3 | a@b.com |
a@b.com (2 rows) and c@d.com (1 row).HAVING COUNT(Email) > 1 filter excludes c@d.com.a@b.com.WHERE COUNT(Email) > 1, which fails because WHERE filters rows before grouping, while HAVING filters groups after aggregation.GROUP BY clause without using an aggregate function.GROUP BY is sufficient.Always remember the order of operations in SQL: FROM -> JOIN -> WHERE -> GROUP BY -> HAVING -> SELECT -> ORDER BY. Understanding that HAVING is specifically for aggregated results is key to passing database-focused interviews.
| Title | Difficulty | Topics | LeetCode |
|---|---|---|---|
| Project Employees I | Easy | Solve | |
| Fix Names in a Table | Easy | Solve | |
| Not Boring Movies | Easy | Solve | |
| Primary Department for Each Employee | Easy | Solve | |
| Queries Quality and Percentage | Easy | Solve |