Magicsheet logo

Select K Disjoint Special Substrings

Medium
12.5%
Updated 8/1/2025

Select K Disjoint Special Substrings

What is this problem about?

The Select K Disjoint Special Substrings interview question asks you to find the maximum number of disjoint (non-overlapping) substrings from a given string, where each selected substring is "special" — meaning each character in the substring appears only within that substring and not anywhere else in the string. You must select at most K such substrings and maximize their count.

Why is this asked in interviews?

Amazon asks this problem because it combines character frequency analysis, greedy interval selection, and dynamic programming — a layered problem that tests multiple skills. The "special" substring constraint (characters appear only within that substring) requires careful character-to-interval mapping using hash tables, while the "disjoint" selection requirement invokes the classic interval scheduling maximization problem.

Algorithmic pattern used

The pattern is hash table + greedy (or DP) interval selection. First, record the first and last occurrence of each character in the string. A substring is "special" if for every character in it, the character's first and last occurrence are both within the substring. Build candidate intervals by merging characters that must co-occur. Then apply greedy interval selection (sort by end position, greedily pick non-overlapping intervals) or DP to select up to K disjoint special substrings.

Example explanation

String: "aabbcc".

Character ranges: a→[0,1], b→[2,3], c→[4,5]. Each range is self-contained (no character spans across ranges). All three are special. Three disjoint special substrings exist: "aa", "bb", "cc". If K=2, select any 2 → answer is 2.

String: "abab": a appears at 0,2; b appears at 1,3. Both ranges span the whole string → they must merge → only 1 candidate interval. If K=2, only 1 disjoint special substring exists.

Common mistakes candidates make

  • Not merging character ranges when a character's span overlaps another — failing to merge leads to invalid (non-special) substrings.
  • Treating K as a minimum rather than a maximum — the goal is to select as many as possible, up to K.
  • Using a greedy by start position instead of end position — interval scheduling must sort by end time for the greedy to be optimal.
  • Confusing "disjoint" substrings with "non-adjacent" — disjoint means non-overlapping, adjacent is allowed.

Interview preparation tip

For the Select K Disjoint Special Substrings coding problem, the hash table and greedy dynamic programming interview pattern applies in two phases: (1) identify special substring intervals using character range merging, (2) greedily select up to K non-overlapping intervals sorted by end position. Amazon interviewers appreciate the parallel to the classic "Activity Selection" problem — mention it explicitly. Practice the character-range-merge pattern as it also appears in Partition Labels, a closely related problem worth studying alongside this one.

Similar Questions