Magicsheet logo

Add Bold Tag in String

Medium
88.6%
Updated 6/1/2025

Add Bold Tag in String

What is this problem about?

In the Add Bold Tag in String coding problem, you are given a main string and a list of dictionary words. You must find all occurrences of these dictionary words within the main string and wrap them in <b> and </b> tags. If two bolded regions overlap or are adjacent, they should be merged into a single tag.

Why is this asked in interviews?

This is a popular Medium difficulty question at Google and Microsoft because it combines string matching with interval merging. It tests your ability to transform a set of distinct "hits" into a clean, formatted output, demonstrating proficiency in both search algorithms and data processing.

Algorithmic pattern used

This problem effectively uses the String Matching interview pattern combined with Interval Merging.

  1. Identify all bolded indices: Create a boolean array isBold of the same length as the main string.
  2. For each dictionary word, find its occurrences in the main string and mark the corresponding isBold indices as true.
  3. Traverse the isBold array to determine where to place the opening and closing tags.

Example explanation

String: "aabbcc", Dictionary: ["aabb", "bbcc"]

  • "aabb" marks indices [0, 1, 2, 3] as bold.
  • "bbcc" marks indices [2, 3, 4, 5] as bold.
  • The combined isBold array is true for all indices 0 through 5.
  • Result: <b>aabbcc</b> (one merged tag instead of <b>aabb</b><b>bbcc</b>).

Common mistakes candidates make

  • Suboptimal Search: Searching for each dictionary word individually using a slow method. Using a Trie or Aho-Corasick can optimize this for large dictionaries.
  • Incorrect Merging: Failing to merge adjacent tags (e.g., <b>a</b><b>b</b> should be <b>ab</b>).
  • String Concatenation: Building the result string by repeatedly adding small pieces, which is inefficient.

Interview preparation tip

Whenever you have overlapping ranges (like bold indices), a boolean mask (an array of true/false) is often easier to manage than a list of start/end pairs.

Similar Questions