Magicsheet logo

Design Search Autocomplete System

Hard
52.2%
Updated 6/1/2025

Design Search Autocomplete System

What is this problem about?

The Design Search Autocomplete System coding problem is a sophisticated task where you design a real-time search suggestion engine. As a user types a sentence character by character, the system should return the top 3 historical sentences that start with the current prefix. Suggestions are ranked by frequency, with lexicographical order as a tie-breaker. When the user finishes a sentence (indicated by a '#'), the system records it or updates its frequency.

Why is this asked in interviews?

This is a high-signal "Hard" problem used by Google, Meta, and Amazon. It tests your ability to handle data streams and manage state across multiple function calls. It evaluations your mastery of the Trie interview pattern and your ability to perform top-K filtering using Priority Queues or sorting. It’s a perfect simulation of a real-world feature found in search engines and messaging apps.

Algorithmic pattern used

The core pattern is a Trie (Prefix Tree). Each node in the trie represents a character and can store a map of sentences that pass through it along with their frequencies.

  1. Search: Traverse the Trie based on the prefix. At the current node, retrieve all possible sentences and find the top 3.
  2. Update: When '#' is reached, insert the full sentence into the Trie or increment its count in all nodes along its path.

Example explanation

History: "i love you" (5 times), "island" (3 times), "ironman" (2 times).

  1. User types 'i': The system looks at the 'i' node. It finds "i love you", "island", and "ironman". All three are returned.
  2. User types 's': Prefix is "is". The system moves to the 's' child of 'i'. Only "island" remains. It returns ["island"].
  3. User types '#': The search ends. If the user typed "i smart#", "i smart" is added to the history with frequency 1.

Common mistakes candidates make

  • Slow Retrieval: Trying to perform a full DFS from the current node every time a character is typed. A better way is to store the "top sentences" or a frequency map directly at each node.
  • Memory Management: Forgetting that a Trie can grow very large; in a real interview, you might discuss how to prune or limit the data.
  • State Reseting: Not properly resetting the search pointer when the user finishes a sentence with '#'.

Interview preparation tip

To optimize retrieval, you can store a Map<String, Integer> of frequencies at each Trie node. While this uses more space, it allows you to get suggestions in O(L)O(L) where LL is the prefix length, rather than O(TotalNodes)O(TotalNodes) using DFS.

Similar Questions