Magicsheet logo

Substring with Concatenation of All Words

Hard
29.7%
Updated 6/1/2025

Substring with Concatenation of All Words

What is this problem about?

This "Hard" rated problem asks you to find all starting indices of substrings in s that are a concatenation of every word in a given list words exactly once. All words in the list have the same length. The order of the words in the substring does not matter. For example, if words is ["foo", "bar"], both "foobar" and "barfoo" are valid substrings. This requires finding "permutations" of the word list within a larger string.

Why is this asked in interviews?

This is a classic question at Apple, Microsoft, and Google. It is a complex application of the Sliding Window and Hash Map patterns. It tests your ability to handle multiple constraints: the words have a fixed length, they must all be used, and they must be contiguous. It's a great test of performance optimization, as a naive approach will be too slow. It also assesses your ability to write clean code for a multi-step logic.

Algorithmic pattern used

The optimal approach uses a Sliding Window with Hash Maps.

  1. Create a frequency map of all the words in the words list.
  2. Since all words have length L, you can run the sliding window L times, starting at indices 0, 1, ..., L-1.
  3. In each pass, you move a window of size len(words) * L.
  4. You maintain another hash map of the words found in the current window.
  5. If the current word is not in the original map, reset the window.
  6. If the current word's frequency exceeds the target frequency, shrink the window from the left.
  7. If all words are matched, record the starting index.

Example explanation (use your own example)

s = "barfoobazbitfoobaz", words = ["foo", "baz"] Word length L = 3.

  1. Window size = 2 * 3 = 6. Target map: {"foo": 1, "baz": 1}.
  2. Pass 1 (start at 0): "barfoo" (invalid), "foobaz" (Valid! Index 3), "bazbit" (invalid).
  3. Pass 2 (start at 1): "arfoob" (invalid)...
  4. Pass 3 (start at 2): "rfooba" (invalid)... Wait, "foobaz" starts at index 3. Let's check "foobaz" at the end. At index 12, "foobaz" is another match. The results would be [3, 12].

Common mistakes candidates make

A common mistake is not running the sliding window L times. Many candidates only run it once from index 0, missing potential matches that start at indices 1 or 2. Another mistake is recreating the hash map for every single window, which is very inefficient. Instead, you should add and remove words from a single hash map as the window slides. Finally, failing to handle cases where the string s is shorter than the total length of all words is a common edge case error.

Interview preparation tip

To master the Substring with Concatenation of All Words interview pattern, focus on the "Sliding Window with a Step" logic. The fact that all words are the same length is the key piece of information—use it! Explain to your interviewer how you are using the hash map to keep track of the "remaining words" needed for a match.

Similar Questions