Magicsheet logo

Find the Missing IDs

Medium
25%
Updated 8/1/2025

Asked by 1 Company

Topics

Find the Missing IDs

What is this problem about?

The Find the Missing IDs coding problem is a database task. You are given a table Customers with an id column. You are told that the IDs should ideally be a continuous sequence from 1 up to the maximum ID present in the table. Your task is to identify all the IDs in that range that are missing from the table.

Why is this asked in interviews?

Amazon and other large-scale e-commerce companies use this to test your proficiency with SQL interview patterns, specifically Recursive Common Table Expressions (CTEs) or Iterative Row Generation. It evaluation whether you know how to generate a reference sequence of numbers on the fly to perform a "Left Join" or "Difference" operation against the actual data. This is a common requirement for auditing and data integrity checks.

Algorithmic pattern used

This is a Recursive SQL Query or Number Generation problem.

  1. Use a Recursive CTE to generate a sequence of integers starting from 1 up to the MAX(id) found in the Customers table.
  2. Perform a LEFT JOIN between this generated sequence and the Customers table on the ID column.
  3. Filter for rows where the Customers.id is NULL.
  4. Alternatively, use the EXCEPT or NOT IN clause to find the difference between the generated sequence and the existing IDs.

Example explanation

Table Customers: [1, 2, 5]

  1. Max ID is 5.
  2. Generate sequence: 1, 2, 3, 4, 5.
  3. Check which are not in the table: 3 and 4 are missing. Result: 3, 4.

Common mistakes candidates make

  • Static Range: Hard-coding the range instead of dynamically finding the maximum ID.
  • Recursive limit: Not knowing that Recursive CTEs often have a default recursion limit (usually 100) and might need a hint/option to handle larger ranges.
  • Efficiency: Forgetting to index the ID column, though in an interview, the focus is on the query logic.

Interview preparation tip

Master Recursive CTEs. They are the standard way in modern SQL (PostgreSQL, MySQL 8+, SQL Server) to solve problems that involve hierarchies or generating sequences. If you're using a database that doesn't support them, mention using a system_calendar or a numbers table.

Similar Questions