Magicsheet logo

Lexicographically Smallest Equivalent String

Medium
65.2%
Updated 6/1/2025

Lexicographically Smallest Equivalent String

What is this problem about?

The "Lexicographically Smallest Equivalent String interview question" involves character equivalence. You are given two strings of equal length where characters at the same position are considered "equivalent." This equivalence is transitive (if a=b and b=c, then a=c). You are then given a third string and must transform it into its lexicographically smallest equivalent version. This "Lexicographically Smallest Equivalent String coding problem" is a classic application of grouping equivalent items.

Why is this asked in interviews?

This problem is a standard test for the "Union Find interview pattern". Companies like Google and Bloomberg use it to see if a candidate can model equivalence relations using a Disjoint Set Union (DSU) structure. It evaluates your ability to handle transitive properties and find a "representative" (the smallest character) for each group.

Algorithmic pattern used

The DSU (Union-Find) pattern is perfect here. For each pair of characters in the first two strings, you perform a union operation. Crucially, when joining two sets, you always make the smaller character the parent. When transforming the target string, you find the root of each character's set; since you always picked the smallest character as the parent, the root will be the lexicographically smallest equivalent character.

Example explanation

s1="parker", s2="morris"

  1. 'p' == 'm', 'a' == 'o', 'r' == 'r', 'k' == 'r', 'e' == 'i', 'r' == 's'.
  2. Groups: {p, m}, {a, o}, {k, r, s}.
  3. Smallest in groups: m<p so 'm' is root. a<o so 'a' is root. k<r<s so 'k' is root.
  4. Target "parser" -> 'p' becomes 'm', 'a' becomes 'a', 'r' becomes 'k', 's' becomes 'k', 'e' becomes 'e', 'r' becomes 'k'. Result: "makkek".

Common mistakes candidates make

  • Not handling transitivity: Only considering direct equalities from the input and missing the chains (e.g., a=b, b=c implies a=c).
  • Inefficient search: Not using Union-Find and instead trying to build an adjacency list and running BFS/DFS for every character.
  • Incorrect root selection: Not ensuring that the root of each set is always the smallest character.

Interview preparation tip

Union-Find is the go-to data structure for any problem involving "equivalence," "connectivity," or "grouping." Make sure you can implement find and union with path compression and union by rank/size quickly.

Similar Questions