Magicsheet logo

Find Maximum Number of Non Intersecting Substrings

Medium
12.5%
Updated 8/1/2025

Find Maximum Number of Non Intersecting Substrings

What is this problem about?

The Find Maximum Number of Non Intersecting Substrings interview question asks you to find the maximum number of non-overlapping substrings that satisfy a specific "self-containment" property. A substring is valid if every character present in it appears only within that substring in the original string. You want to pick the maximum number of such disjoint valid substrings.

Why is this asked in interviews?

Bloomberg use this problem to test your ability to handle Interval and Greedy interview patterns. It requires identifying all potential candidate substrings (intervals) and then solving a variation of the "Interval Scheduling" problem. It evaluations whether you can recognize character dependencies and correctly find the smallest valid range for each character.

Algorithmic pattern used

This problem is solved in two stages:

  1. Candidate Discovery: For each character 'a'-'z', find its first and last occurrence. Expand this range if other characters inside it have occurrences outside. This is identical to finding "Self-Contained Substrings."
  2. Greedy Selection: Once you have a list of all valid candidate intervals, sort them by their end points.
  3. Iterate through the sorted intervals and pick an interval if it does not overlap with the previously picked one.

Example explanation

String: "adefaddaccc"

  1. 'e' range: [2, 2]. Valid.
  2. 'f' range: [3, 3]. Valid.
  3. 'c' range: [8, 10]. Valid.
  4. 'a' range: [0, 7]. Valid. Candidates: [2,2], [3,3], [8,10], [0,7]. Sorted by end: [2,2], [3,3], [0,7], [8,10]. Selection:
  • Pick [2,2].
  • Pick [3,3].
  • [0,7] overlaps with [2,2]. Skip.
  • Pick [8,10]. Total: 3.

Common mistakes candidates make

  • Overlapping candidates: Thinking any range is a candidate. Candidates must be "self-contained."
  • Sorting by start: Sorting by start time doesn't work for maximizing disjoint intervals; you must sort by end time.
  • Expansion failure: Not expanding the range correctly when a new character is discovered inside an existing interval.

Interview preparation tip

The "Interval Scheduling Maximization" problem is a core greedy algorithm. Always sort by the end time to leave as much room as possible for future intervals.

Similar Questions