Magicsheet logo

Alien Dictionary

Hard
80.4%
Updated 6/1/2025

Alien Dictionary

What is this problem about?

The "Alien Dictionary interview question" is a classic hard-level graph problem. You are given a list of strings sorted lexicographically according to the rules of a new, unknown language. Your task is to derive the order of characters in this "alien" alphabet. If the given list of words is inconsistent (meaning no such ordering exists) or if the information is insufficient, you must handle those cases accordingly. This problem is essentially asking you to reconstruct a sequence based on partial ordering rules.

Why is this asked in interviews?

Tech giants like Apple, Google, and Meta frequently use the "Alien Dictionary coding problem" because it covers multiple complex topics: graph construction, edge cases, and ordering algorithms. It tests whether a candidate can transform an abstract problem into a standard graph representation and then apply a well-known algorithm to solve it. It also requires careful handling of strings and character mapping.

Algorithmic pattern used

The core of this problem is the Topological Sort pattern within a Directed Graph.

  1. Graph Construction: Compare adjacent words to find the first differing character. This difference establishes a directed edge (e.g., if "w" comes before "e", then w -> e).
  2. Dependency Management: Use an adjacency list to store these edges and an in-degree array to track how many characters must come before each specific character.
  3. Ordering: Apply Kahn's algorithm (using a Queue and BFS) or a DFS-based topological sort to find a valid sequence. If the resulting sequence length is less than the number of unique characters, a cycle exists, and no valid order is possible.

Example explanation

Input: ["z", "x", "z"]

  1. Compare "z" and "x": z comes before x. (Edge: z -> x)
  2. Compare "x" and "z": x comes before z. (Edge: x -> z)
  3. Result: We have a cycle z -> x -> z. This is impossible in a sorted dictionary, so we return an empty string. Input: ["wrt", "wrf", "er", "ett", "rftt"]
  4. wrt vs wrf -> t -> f
  5. wrf vs er -> w -> e
  6. er vs ett -> r -> t
  7. ett vs rftt -> e -> r Order: w -> e -> r -> t -> f

Common mistakes candidates make

  • Failing to detect cycles: If the language has a circular dependency (A before B, B before A), the candidate must return an empty string.
  • Prefix Edge Case: If a longer word appears before its prefix (e.g., ["apple", "app"]), this is an invalid ordering, and the algorithm should return an empty string.
  • Ignoring characters with no dependencies: Every unique character in the input must be included in the final output, even if it doesn't have a specific "before" or "after" relationship.

Interview preparation tip

Master Topological Sort! It is the go-to algorithm for any problem involving dependencies or ordering. Practice building adjacency lists from string comparisons and always dry-run your code with a small example that includes a cycle to ensure your logic catches it.

Similar Questions