The Distinct Subsequences II interview question asks you to find the total number of unique non-empty subsequences of a single string s. Because the string can contain duplicate characters, different sets of indices might result in the same subsequence (e.g., in "aba", deleting the first 'a' or the last 'a' results in "ba"). You need to return the count modulo .
Amazon and Google ask this "Hard" problem to evaluate your ability to handle duplicate counting in Dynamic Programming. It evaluation whether you can derive a formula that subtracts over-counted states. It’s a sophisticated test of combinatorial logic and modular arithmetic.
This is a 1D Dynamic Programming problem with a twist.
dp[i] be the number of distinct subsequences using the first characters.s[i] is new: dp[i] = 2 * dp[i-1] + 1 (Each existing subsequence can either include or exclude the new char, plus the char itself).s[i] has appeared before at index j: dp[i] = 2 * dp[i-1] - dp[j-1]. We subtract dp[j-1] because all subsequences formed by adding the previous occurrence of this character are now being duplicated by the current occurrence.dp[n].String: "aba"
dp[1] = {a} (Count 1).dp[2] = {a, b, ab} (Count ).{aa, ba, aba}.{a, b, ab}.last_occurrence_dp.
Final count for "aba" is 6: "a", "b", "aa", "ab", "ba", "aba".(a - b + mod) % mod.This problem is a variation of the "Counting Subsets" problem but with duplicates. Whenever you need to count "unique" things in a sequence with duplicates, think about storing the "last time this contribution was added" to avoid double-counting. This is a common Dynamic Programming interview pattern.