Magicsheet logo

Sum of Beauty of All Substrings

Medium
89.6%
Updated 6/1/2025

Sum of Beauty of All Substrings

What is this problem about?

In this problem, the "beauty" of a string is defined as the difference between the frequency of the most frequent character and the least frequent character. You are given a string and asked to find the sum of the beauty of all its possible substrings. Characters with zero frequency are ignored when determining the "least frequent" character. For example, in "abaac", the beauty is 3 (for 'a') minus 1 (for 'b' or 'c'), which is 2.

Why is this asked in interviews?

This is a popular medium-level string problem at Microsoft and Meta. It tests a candidate's ability to efficiently manage frequency counts while iterating through substrings. While it might seem like it requires complex data structures, it can be solved by cleverly using nested loops and an array for character counts, demonstrating that the candidate understands the constraints (lowercase English letters only).

Algorithmic pattern used

The pattern is "Substring Iteration with Frequency Tracking." You use two nested loops: the outer loop picks the starting index of the substring, and the inner loop expands the substring one character at a time. Within the inner loop, you maintain a frequency array of size 26. As the substring grows, you update the count of the new character and then find the max and min frequencies currently in the array to calculate beauty.

Example explanation

String: "aab" Substrings:

  • "a": Max=1, Min=1. Beauty = 0.
  • "aa": Max=2, Min=2. Beauty = 0.
  • "aab": Max=2 ('a'), Min=1 ('b'). Beauty = 2-1 = 1.
  • "a": Max=1, Min=1. Beauty = 0.
  • "ab": Max=1, Min=1. Beauty = 0.
  • "b": Max=1, Min=1. Beauty = 0. Total Sum = 1.

Common mistakes candidates make

  1. Handling Zero Frequencies: Including characters that don't appear in the current substring in the "minimum frequency" calculation. You must only consider characters with frequency > 0.
  2. Inefficient Min/Max Calculation: Re-scanning the frequency array more than necessary. While scanning 26 elements is O(1), doing it inside three nested loops (start, end, and 26) can be optimized.
  3. Memory Management: Creating a new frequency map/array for every single substring instead of updating one incrementally as the substring expands.

Interview preparation tip

When dealing with problems that involve character frequencies, always remember that there are only 26 lowercase English letters. This means an array of size 26 often acts as a constant-time hash map, which is much faster than using a generic unordered_map or HashMap.

Similar Questions