Magicsheet logo

Set Mismatch

Easy
48%
Updated 6/1/2025

Set Mismatch

What is this problem about?

The Set Mismatch interview question gives you a set of n integers that was supposed to contain all numbers from 1 to n exactly once. Due to a data error, one number appears twice and one number is missing. Find and return both: [duplicate, missing]. This is a number theory and bit manipulation problem with multiple efficient approaches.

Why is this asked in interviews?

Microsoft, Meta, Amazon, Google, and Bloomberg ask this problem because it has multiple clean solutions — hash map, math (sum formulas), XOR, and cyclic sort — and interviewers can ask for progressively more space-efficient approaches. It tests counting, deduplication, and the mathematical relationship between expected and observed sums. The O(1) space solution using XOR or mathematical equations is particularly valued.

Algorithmic pattern used

Three approaches: hash count, math, and XOR. Hash count: count each number; the one with count 2 is duplicate, the one with count 0 is missing. Math approach: duplicate = sum(nums) - n*(n+1)/2 + missing. Combined with duplicate^2 - missing^2 = (sum_of_squares(nums) - n*(n+1)*(2n+1)/6) to solve for both. XOR approach: XOR all numbers 1..n with all elements; the result has the bits where duplicate and missing differ — split into two groups and XOR separately. Cyclic sort: place each number at index num-1; the misplaced number is the duplicate.

Example explanation

Input: [1, 2, 2, 4] (n=4).

Expected sum: 1+2+3+4 = 10. Actual sum: 9. Difference: 10-9=1. But this alone gives duplicate-missing=1.

Count approach: count 2 appears twice → duplicate=2. Count 3 appears zero → missing=3.

Result: [2, 3].

XOR approach: XOR all elements with 1..4 → xor of (duplicate, missing). Find a set bit, split into two groups, XOR each group to find the two values.

Common mistakes candidates make

  • Using the sum formula alone — sum(nums) - n*(n+1)/2 gives duplicate-missing, not the individual values. Need a second equation.
  • Not handling the case where duplicate==missing (impossible by problem constraints, but good to confirm).
  • In the cyclic sort, not continuing to swap when the value at nums[i]-1 already equals nums[i] (duplicate condition).
  • XOR approach: not correctly identifying the distinguishing bit to split the numbers into two groups.

Interview preparation tip

For the Set Mismatch coding problem, the hash table bit manipulation array interview pattern offers multiple valid solutions. Start with the hash count approach (simplest), then explain the O(1) space XOR or cyclic sort approach. Grammarly and GitHub interviewers appreciate knowing multiple approaches and being able to implement the optimal one. The cyclic sort technique — placing each number at its correct index — is especially reusable: it solves "Find All Duplicates," "First Missing Positive," and several similar problems. Practice cyclic sort as a template.

Similar Questions