Magicsheet logo

Find the Duplicate Number

Medium
53.9%
Updated 6/1/2025

Find the Duplicate Number

What is this problem about?

The Find the Duplicate Number interview question is a classic algorithmic challenge. You are given an array of n+1n+1 integers where each integer is in the range [1,n][1, n]. According to the Pigeonhole Principle, at least one duplicate number must exist. Your task is to find the duplicate number without modifying the array and using only O(1)O(1) extra space.

Why is this asked in interviews?

This Find the Duplicate Number coding problem is a favorite of top tech firms like Google, Microsoft, and Cisco. It tests your ability to look at a problem from multiple perspectives: as a search problem (Binary Search), a bit manipulation problem, or most elegantly, as a graph problem. It evaluates whether you can find a solution that satisfies strict constraints on space and mutation.

Algorithmic pattern used

The most optimal solution uses Floyd's Cycle-Finding Algorithm (Tortoise and Hare).

  1. Graph Mapping: Treat each index ii as a node and nums[i] as a directed edge to node nums[i]. Because there's a duplicate, the "path" will eventually form a cycle.
  2. Phase 1 (Intersection): Move slow by one step and fast by two steps. They will eventually meet inside the cycle.
  3. Phase 2 (Entrance): Move slow back to the start and move both at the same speed. They will meet at the entrance of the cycle, which corresponds to the duplicate number.

Example explanation

Array: [1, 3, 4, 2, 2]

  • Start at index 0. nums[0] = 1.
  • Move: 0 -> 1 -> 3 -> 2 -> 2 -> 2...
  • The cycle starts at '2'.
  • The tortoise and hare will meet, and the entrance phase will point to the value 2.

Common mistakes candidates make

  • Modifying the array: Sorting the array or using "negation marking" (O(1)O(1) space but modifies the input). The problem usually forbids this.
  • Using a Set: Using O(N)O(N) extra space to store seen numbers.
  • Binary Search Errors: Implementing the O(NlogN)O(N \log N) binary search on the range of values incorrectly.

Interview preparation tip

Master the "Graph Linked List" analogy. Any array where the values are within the range of indices can be treated as a functional graph. This is a powerful Two Pointers interview pattern that solves many "Cycle" related questions.

Similar Questions