Magicsheet logo

Edit Distance

Medium
48.3%
Updated 6/1/2025

Edit Distance

What is this problem about?

The Edit Distance coding problem (also known as Levenshtein distance) asks for the minimum number of operations required to convert one string into another. You are allowed three operations: insert a character, delete a character, or replace a character. Each operation has a cost of 1. The goal is to transform word1 into word2 using the fewest possible moves.

Why is this asked in interviews?

This is one of the most famous dynamic programming interview pattern problems, asked by tech giants like Google, Amazon, and Microsoft. It tests a candidate's ability to identify overlapping subproblems and define a clear state transition. It has real-world applications in spell checkers, DNA sequence alignment, and natural language processing.

Algorithmic pattern used

The problem is solved using 2D Dynamic Programming.

  1. Define dp[i][j] as the minimum operations to convert the prefix word1[0...i-1] to word2[0...j-1].
  2. Base cases:
    • To convert an empty string to a string of length j, you need j insertions.
    • To convert a string of length i to an empty string, you need i deletions.
  3. Transition:
    • If word1[i-1] == word2[j-1], no operation is needed: dp[i][j] = dp[i-1][j-1].
    • Otherwise, take the minimum of:
      • dp[i-1][j] + 1 (Delete)
      • dp[i][j-1] + 1 (Insert)
      • dp[i-1][j-1] + 1 (Replace)

Example explanation

Convert "horse" to "ros":

  1. Compare 'h' and 'r'. Not equal. Min(Insert, Delete, Replace).
  2. If we replace 'h' with 'r', we move to converting "orse" to "os".
  3. Eventually, we might delete the 'r' and 'e' from "horse". The DP table systematically calculates these costs, finding that the minimum operations are 3:
  • "horse" -> "rorse" (replace 'h' with 'r')
  • "rorse" -> "rose" (remove 'r')
  • "rose" -> "ros" (remove 'e')

Common mistakes candidates make

  • Incorrect Table Size: Forgetting that the table needs to be (m+1) x (n+1) to account for empty string states.
  • Wrong Transition Logic: Confusing the indices for insertion vs. deletion.
  • Space Complexity: Using a full 2D matrix when only two rows are needed to calculate the current state (O(min(M, N)) space optimization).

Interview preparation tip

Be prepared to explain the "intuition" behind each choice in the min() function. For example, why does dp[i][j-1] represent an insertion? (Because you've already matched word1[0...i] with word2[0...j-1], so you just "insert" the character word2[j] into word1).

Similar Questions