Magicsheet logo

Groups of Strings

Hard
100%
Updated 6/1/2025

Groups of Strings

What is this problem about?

The Groups of Strings coding problem asks you to group words based on three specific operations. Two strings belong to the same group if you can transform one into the other by: adding a character, deleting a character, or replacing a character. You are given an array of strings (where each string has unique lowercase letters). You need to return the number of groups and the size of the largest group.

Why is this asked in interviews?

Lowe's and other tech companies ask this "Hard" Union Find and Bit Manipulation problem to test your ability to optimize complex relationship graphs. An O(N2)O(N^2) approach comparing every string to every other string is too slow for N=2imes104N=2 imes 10^4. It evaluates if you can use bitmasks to represent sets of characters and perform O(1)O(1) transformations to find neighbors.

Algorithmic pattern used

This is solved using Bitmasking and Disjoint Set Union (DSU).

  1. Convert each string into a 26-bit integer (bitmask).
  2. Store these bitmasks in a Hash Map to keep track of counts/indices.
  3. For each unique bitmask, generate all valid "neighbor" bitmasks:
    • Add/Delete: Flip each of the 26 bits one by one. If the new mask exists in the map, Union them.
    • Replace: Find a bit that is 1, turn it to 0, and turn a 0 to 1. If this new mask exists, Union them.
  4. Alternatively, the "Replace" operation is equivalent to deleting a character to reach an intermediate state, and then adding a different character. By connecting all words to their "deletion" masks, replacements are handled transitively.

Example explanation

Strings: ["a", "b", "ab", "cde"]

  1. Masks: a=1, b=2, ab=3, cde=28.
  2. For "ab" (mask 3):
    • Delete 'b': mask becomes 1 ("a"). "a" exists! Union("ab", "a").
    • Delete 'a': mask becomes 2 ("b"). "b" exists! Union("ab", "b").
  3. Now "a", "b", and "ab" are in the same group. "a" and "b" are connected via a replacement (transitively through "ab", or directly by checking replace masks).
  4. "cde" has no neighbors that exist in the set. Result: 2 groups. Max size is 3.

Common mistakes candidates make

  • O(N2)O(N^2) Comparison: Checking the edit distance between every pair of strings, causing a TLE.
  • Generating Replacements Naively: Looping through all 1s and all 0s for a replace operation is 26imes2626 imes 26 checks. It's much faster to use the "virtual intermediate node" trick (e.g., connect "abc" and "adc" both to the virtual node "ac").
  • String manipulation: Doing these operations on actual string objects instead of fast integer bitmasks.

Interview preparation tip

When strings contain unique lowercase letters, always think of 26-bit integers. It transforms O(L)O(L) string operations into O(1)O(1) bitwise operations (mask ^ (1 << i)).

Similar Questions