Magicsheet logo

Smallest Subsequence of Distinct Characters

Medium
89.5%
Updated 6/1/2025

Smallest Subsequence of Distinct Characters

What is this problem about?

The "Smallest Subsequence of Distinct Characters" interview question is a classic challenge that focuses on string manipulation and greedy optimization. The goal is to take a given string and return a new string that contains every unique character from the original input exactly once. However, there is a specific constraint: the resulting string must be the lexicographically smallest possible subsequence among all such possibilities.

In this context, a subsequence is formed by deleting zero or more characters from the original string without changing the order of the remaining characters. For example, if you have duplicate characters like 'b' appearing multiple times, you must decide which 'b' to keep and which ones to discard so that the final sequence of unique characters is as "small" as possible in alphabetical order (e.g., "abc" is smaller than "bac").

Why is this asked in interviews?

This problem is a favorite among top-tier companies like Google, Amazon, and ByteDance because it tests a candidate's ability to balance multiple constraints simultaneously. It isn't just about finding unique characters (which could be done with a simple set); it's about maintaining the relative order while optimizing for the smallest dictionary sequence. It evaluates your understanding of greedy algorithms—where you make the locally optimal choice at each step—and your proficiency with data structures like stacks or monotonic stacks to track and refine your choices as you traverse the string.

Algorithmic pattern used

The primary algorithmic pattern used here is the Monotonic Stack combined with a Greedy approach. As we iterate through the string, we maintain a stack of characters. For each character, we decide whether to add it to the stack. If the current character is smaller than the top of the stack and the character at the top of the stack appears again later in the string, we can safely pop the top character because we know we can pick it up later to achieve a smaller lexicographical result. This "look-ahead" logic is crucial for the greedy strategy to work.

Example explanation

Imagine the input string is "cbacdcbc".

  1. We start with 'c'. Stack: ['c'].
  2. Next is 'b'. 'b' < 'c', and 'c' appears later in the string (at index 3). So, we pop 'c' and push 'b'. Stack: ['b'].
  3. Next is 'a'. 'a' < 'b', and 'b' appears later (at index 5). So, we pop 'b' and push 'a'. Stack: ['a'].
  4. Next is 'c'. It's not in our result yet, so we push it. Stack: ['a', 'c'].
  5. Next is 'd'. Push it. Stack: ['a', 'c', 'd'].
  6. Next is 'c'. 'c' is already in our stack, so we skip it to maintain uniqueness.
  7. Next is 'b'. 'b' < 'd', but 'd' does not appear later. We cannot pop 'd'. Similarly for 'c'. So we just push 'b' if it's not there. The final result for a similar logic would be "acdb".

Common mistakes candidates make

One common error is using a simple sorting or set-based approach, which completely ignores the subsequence order constraint. Another mistake is forgetting to keep track of which characters are already in the stack, leading to duplicate characters in the output. Candidates also often fail to correctly implement the "look-ahead" count (often using a frequency map), resulting in popping characters that never appear again, which makes it impossible to include them in the final unique set.

Interview preparation tip

To master this Smallest Subsequence of Distinct Characters coding problem, practice problems that involve monotonic stacks, such as "Remove K Digits" or "Next Greater Element". Understanding how a stack can maintain a specific order (increasing or decreasing) while allowing for dynamic updates is a powerful tool in your interview pattern toolkit. Always trace your stack-based logic with a pen and paper for complex strings to ensure your greedy choices are sound.

Similar Questions