Magicsheet logo

Distinct Echo Substrings

Hard
25%
Updated 8/1/2025

Distinct Echo Substrings

What is this problem about?

The Distinct Echo Substrings interview question asks you to find the number of unique substrings within a given string that can be represented as two identical concatenated strings (e.g., "abcabc" or "aaaa"). These are known as "echo" or "square" substrings. You need to return the count of distinct strings that meet this criteria, not the total number of occurrences.

Why is this asked in interviews?

Google uses this "Hard" problem to test a candidate's knowledge of string optimization and hashing. A naive O(n3)O(n^3) approach (checking every possible substring) will time out for strings of length 2000. It evaluation your ability to use the Rolling Hash interview pattern or Suffix Structures to compare substrings in constant time. It tests whether you can balance the complexity of matching patterns with the need to maintain a set of unique results.

Algorithmic pattern used

This problem is best solved using Rolling Hash or String Hashing.

  1. Iterate through all possible lengths LL of the repeated part (from 1 to n/2n/2).
  2. Slide a window of length 2L2L through the string.
  3. For each window, check if the hash of the first half (index ii to i+L1i+L-1) is equal to the hash of the second half (index i+Li+L to i+2L1i+2L-1).
  4. If they match, add the substring to a HashSet to ensure only distinct echo substrings are counted.
  5. Using precomputed prefix hashes and powers of the base, each comparison takes O(1)O(1) time, leading to an overall O(n2)O(n^2) complexity.

Example explanation

String: "abcabcabc"

  1. Length L=3L=3:
    • Window "abcabc": first half "abc", second half "abc". Match! Add "abcabc" to set.
    • Window "bcabca": first half "bca", second half "bca". Match! Add "bcabca" to set.
    • Window "cabcab": first half "cab", second half "cab". Match! Add "cabcab" to set.
  2. Length L=1L=1:
    • No "aa", "bb", or "cc" substrings. Result: 3 distinct echo substrings.

Common mistakes candidates make

  • Hash Collisions: Using a single hash with a small modulo, which might result in incorrect matches. Using a 64-bit hash or double hashing is safer.
  • Brute Force String Comparison: Using s.substring(i, i+L).equals(s.substring(i+L, i+2L)), which makes the complexity O(n3)O(n^3).
  • Overcounting: Not using a HashSet to store the actual strings (or their hashes) to filter out duplicates.

Interview preparation tip

Practice implementing a Rolling Hash. It is a versatile tool for problems involving repeated patterns, palindromes, or substring matching. Understanding how to calculate the hash of any range in O(1)O(1) using prefix sums is a high-level skill that impresses interviewers.

Similar Questions