Magicsheet logo

Longest Common Subsequence

Medium
53.4%
Updated 6/1/2025

Longest Common Subsequence

What is this problem about?

The Longest Common Subsequence (LCS) problem gives you two strings, text1 and text2, and asks you to find the length of their longest common subsequence. A subsequence is a sequence generated from a string by deleting some (or no) characters without changing the relative order of the remaining characters. Unlike a substring, a subsequence does not need to be contiguous. For example, "ace" is a subsequence of "abcde".

Why is this asked in interviews?

LCS is the quintessential Dynamic Programming (DP) interview question. It is highly favored by interviewers because it serves as the foundational building block for many other complex string comparison algorithms, such as the ones used in git diff or DNA sequencing. It tests a candidate's ability to break a problem down into overlapping subproblems and build a 2D matrix to store intermediate results.

Algorithmic pattern used

This problem is solved using the Dynamic Programming pattern, specifically a 2D DP matrix. You create a grid where the rows represent characters of text1 and columns represent characters of text2. As you iterate through the grid, if the characters at the current indices match, the LCS length is 1+1 + the value from the diagonal top-left cell. If they don't match, you take the maximum value from the cell directly above or directly to the left.

Example explanation

Let text1 = "abcde" and text2 = "ace". We initialize a 2D matrix dp padded with zeros.

  • Compare 'a' and 'a': Match! dp[i][j] = 1 + dp[i-1][j-1].
  • Compare 'b' and 'a': Mismatch. We take max(dp[i-1][j], dp[i][j-1]). As we fill out the grid, we trace the matches:
  1. 'a' matches 'a' (length 1)
  2. 'c' matches 'c' (length 2)
  3. 'e' matches 'e' (length 3) The 'b' and 'd' in text1 are essentially skipped because inheriting the maximum previous length effectively "deletes" them from consideration. The final cell in the bottom-right of the matrix contains 3, the maximum length.

Common mistakes candidates make

A very common mistake is confusing a subsequence with a substring. If a candidate tries to enforce contiguous matching (resetting the count to zero upon a mismatch), they will fail the test cases. Another mistake is using pure recursion without memoization, which results in an exponential O(2N)O(2^N) time complexity and a massive call stack that will Time Limit Exceed.

Interview preparation tip

When practicing the Longest Common Subsequence coding problem, don't just memorize the 2D array solution. Challenge yourself to optimize the space complexity. Since calculating the current row of the DP matrix only requires the values from the previous row, you can solve this using just a 1D array, reducing the space complexity from O(M×N)O(M \times N) to O(N)O(N). This optimization is a huge plus in technical interviews.

Similar Questions