Sentence Similarity

Easy
25%
Updated 8/1/2025

Asked by 1 Company

What is this problem about?

The Sentence Similarity interview question gives you two sentences as arrays of words, and a list of similar word pairs. Two sentences are similar if they have the same length and each corresponding word pair is either identical or in the similar-pairs list (similarity is not transitive — only direct pairs count). Determine whether the two sentences are similar.

Why is this asked in interviews?

Google asks this problem because it tests hash set lookups combined with pairwise comparison logic. The non-transitivity constraint — similar(A, B) and similar(B, C) does NOT imply similar(A, C) — is a subtle rule that distinguishes this from the harder Sentence Similarity II, which uses Union-Find for transitive similarity. Interviewers test whether candidates understand the problem boundaries before choosing a data structure.

Algorithmic pattern used

The pattern is a hash set with pairwise validation. First check that the two sentences have equal length; if they do not, return false. Then build a hash set of similar pairs, storing each pair either as a frozenset or as both (w1, w2) and (w2, w1) so that lookups are symmetric. For each index i, verify that words1[i] == words2[i] or that (words1[i], words2[i]) is in the similar-pairs set. If any position fails, return false; otherwise return true.
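The steps above can be sketched in Python as follows. This is a minimal illustration rather than a reference solution; the function name `are_sentences_similar` and its parameter names are assumptions chosen for this example.

```python
from typing import List


def are_sentences_similar(
    words1: List[str], words2: List[str], pairs: List[List[str]]
) -> bool:
    # Sentences of different lengths can never be similar.
    if len(words1) != len(words2):
        return False

    # Store every pair in both directions so lookups are symmetric.
    similar = set()
    for a, b in pairs:
        similar.add((a, b))
        similar.add((b, a))

    # Each position must match exactly or through a direct similar pair.
    return all(
        w1 == w2 or (w1, w2) in similar
        for w1, w2 in zip(words1, words2)
    )
```

Checking the lengths before building the set avoids the set construction entirely when the answer is already known to be false.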

Example explanation

sentence1: ["great", "acting", "skills"]
sentence2: ["fine", "drama", "talent"]
pairs: [["great", "fine"], ["drama", "acting"], ["skills", "talent"]]

Build set: {("great","fine"), ("fine","great"), ("drama","acting"), ("acting","drama"), ("skills","talent"), ("talent","skills")}.

  • "great" vs "fine": in set ✓
  • "acting" vs "drama": in set ✓
  • "skills" vs "talent": in set ✓

Return true.

Common mistakes candidates make

  • Not storing pairs in both directions: ("great", "fine") and ("fine", "great") must both be in the set for symmetric lookup, unless each pair is stored as an order-independent frozenset.
  • Applying transitivity: "great"~"fine" and "fine"~"decent" do NOT make "great"~"decent" in this version.
  • Not checking sentence lengths first: sentences of different lengths are never similar, so return false immediately.
  • Using a list search instead of a hash set: scanning a list of m pairs for each of n words costs O(n × m), which is acceptable only for small inputs; a hash set gives O(1) average time per lookup, for O(n + m) overall.
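The first two mistakes are easy to reproduce in a few lines. The sketch below uses a hypothetical helper, `build_pair_set`, whose name and `symmetric` flag are invented for this illustration. It shows a reversed lookup failing when only one direction is stored, and a transitive lookup failing even when both directions are stored.

```python
def build_pair_set(pairs, symmetric=True):
    """Hypothetical helper: collect pairs, optionally in both directions."""
    similar = set()
    for a, b in pairs:
        similar.add((a, b))
        if symmetric:
            similar.add((b, a))
    return similar


pairs = [["great", "fine"], ["fine", "decent"]]

one_way = build_pair_set(pairs, symmetric=False)
two_way = build_pair_set(pairs)

# Reversed lookup fails when only one direction is stored.
print(("fine", "great") in one_way)   # False
print(("fine", "great") in two_way)   # True

# No transitivity: "great" and "decent" were never listed as a direct pair.
print(("great", "decent") in two_way)  # False
```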

Interview preparation tip

For the Sentence Similarity coding problem, the standard approach is a hash set of word pairs stored symmetrically. The key distinction from Sentence Similarity II is transitivity: use a plain set when similarity is not transitive, and Union-Find when it is. Google interviewers often ask both versions back-to-back, so be ready to explain why the data structure changes. In Python, storing each pair as a frozenset handles symmetry without duplicating entries.
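As a rough sketch of the frozenset idea, each pair can be stored once and looked up in either order. The function name `are_sentences_similar_frozenset` is an assumption made for this example.

```python
from typing import List


def are_sentences_similar_frozenset(
    words1: List[str], words2: List[str], pairs: List[List[str]]
) -> bool:
    if len(words1) != len(words2):
        return False

    # A frozenset ignores order, so one entry covers both directions.
    similar = {frozenset(pair) for pair in pairs}

    return all(
        w1 == w2 or frozenset((w1, w2)) in similar
        for w1, w2 in zip(words1, words2)
    )
```

The trade-off is one frozenset allocation per compared position in exchange for a set half the size. Either variant performs a constant number of average-case O(1) lookups per word.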
