Magicsheet logo

Maximum Number of Occurrences of a Substring

Medium
37.1%
Updated 6/1/2025

Maximum Number of Occurrences of a Substring

What is this problem about?

The Maximum Number of Occurrences of a Substring coding problem asks you to find the maximum frequency of any substring that satisfies three conditions: its length is between minLength and maxLength (inclusive), it contains no more than maxLetters distinct characters, and it appears as a substring in the given string s.

Why is this asked in interviews?

Hubspot, Microsoft, Addepar, Meta, Roblox, Atlassian, Yahoo, Salesforce, and Amazon utilize this problem to test a candidate's ability to combine sliding window techniques with hash maps for frequency counting and character distinctness checks. It's a good test of managing constraints efficiently within a string processing context.

Algorithmic pattern used

Sliding Window with Hash Map: Iterate through all possible substrings of length minLength up to maxLength. A more efficient approach uses a fixed-size sliding window, but only needs to consider substrings of length minLength. Why minLength? If a substring of length L (where minLength < L <= maxLength) is valid and appears X times, then any of its prefixes or substrings of length minLength that are also valid (i.e., meet the distinct character count) must appear at least X times. Therefore, we only need to count occurrences of valid substrings of length minLength.

Algorithm:

  1. Initialize a frequency map for substrings and a distinct character count map for the current window.
  2. Use a sliding window of size minLength.
  3. For each window: a. Add the new character to the window, update distinct character count. b. Remove the character exiting the window, update distinct character count. c. If the current window (substring of length minLength) has distinct_characters <= maxLetters, then add it to the substring frequency map.
  4. The maximum value in the substring frequency map is the answer.

Example explanation

s = "aababcaab", maxLetters = 2, minLength = 3, maxLength = 4. Window size = 3 (since we only need to check minLength substrings).

  1. "aab": distinct 'a','b' (count=2 <= maxLetters). Freq: {"aab": 1}.
  2. "aba": distinct 'a','b' (count=2 <= maxLetters). Freq: {"aab": 1, "aba": 1}.
  3. "bab": distinct 'b','a' (count=2 <= maxLetters). Freq: {"aab": 1, "aba": 1, "bab": 1}.
  4. "abc": distinct 'a','b','c' (count=3 > maxLetters). Invalid.
  5. "bca": distinct 'b','c','a' (count=3 > maxLetters). Invalid.
  6. "caa": distinct 'c','a' (count=2 <= maxLetters). Freq: {"aab": 1, "aba": 1, "bab": 1, "caa": 1}.
  7. "aab": distinct 'a','b' (count=2 <= maxLetters). Freq: {"aab": 2, "aba": 1, "bab": 1, "caa": 1}.

Max occurrence = 2 for "aab".

Common mistakes candidates make

  • Checking all lengths from minLength to maxLength: This is inefficient and unnecessary. A valid substring of length L > minLength implies a valid substring of length minLength within it, and that shorter substring will have at least as many occurrences.
  • Inefficient distinct character count: A hash map or frequency array for characters in the current window is essential for O(1) updates.
  • Edge cases with maxLetters: When maxLetters is 0 or 1, ensure the logic handles single-character substrings or no distinct characters correctly.

Interview preparation tip

For the Hash Table Sliding Window String interview pattern, this problem is a classic example of using a fixed-size sliding window (of minLength) to count valid substrings. Focus on efficient window maintenance (adding/removing characters, updating distinct counts) and understanding why checking only minLength substrings is sufficient.

Similar Questions