Magicsheet logo

Count Number of Homogenous Substrings

Medium
92.4%
Updated 6/1/2025

Asked by 3 Companies

Count Number of Homogenous Substrings

What is this problem about?

A string is "homogenous" if all the characters in it are the same. Given a string s, you need to return the number of homogenous substrings it contains. For example, "aaaa" has substrings "a", "aa", "aaa", and "aaaa" appearing multiple times. Since the result can be large, return it modulo 109+710^9 + 7.

Why is this asked in interviews?

This problem is popular at Virtu Financial and Google because it tests the ability to recognize a triangular number sequence (1,3,6,10,1, 3, 6, 10, \dots) within a string traversal. It evaluations whether a candidate can avoid the O(n2)O(n^2) complexity of generating all substrings and instead use a mathematical formula to count them in O(n)O(n).

Algorithmic pattern used

The pattern is Linear Scan with Math (Triangular Numbers).

  1. Iterate through the string and find the lengths of consecutive identical character segments.
  2. For a segment of length LL, the number of homogenous substrings it contributes is the sum of integers from 1 to LL, which is Limes(L+1)2\frac{L imes (L+1)}{2}.
  3. Sum these values for all segments.

Example explanation

String: "abbcccaa"

  • Segment "a": Length 1. Substrings: 1imes22=1\frac{1 imes 2}{2} = 1.
  • Segment "bb": Length 2. Substrings: 2imes32=3\frac{2 imes 3}{2} = 3.
  • Segment "ccc": Length 3. Substrings: 3imes42=6\frac{3 imes 4}{2} = 6.
  • Segment "aa": Length 2. Substrings: 2imes32=3\frac{2 imes 3}{2} = 3. Total: 1+3+6+3=131 + 3 + 6 + 3 = 13.

Common mistakes candidates make

A frequent mistake is not applying the modulo at each addition, leading to overflow for very long strings. Another is trying to generate and count every substring, which is highly inefficient. Some candidates also struggle with the boundary logic of the segment counting loop (e.g., missing the last segment).

Interview preparation tip

Whenever you need to count all possible subarrays or substrings that satisfy a property, check if that property holds for contiguous blocks. If it does, the formula n(n+1)/2n(n+1)/2 is almost always the key to an optimal solution.

Similar Questions