The Sentence Similarity interview question gives you two sentences as arrays of words, and a list of similar word pairs. Two sentences are similar if they have the same length and each corresponding word pair is either identical or in the similar-pairs list (similarity is not transitive — only direct pairs count). Determine whether the two sentences are similar.
Google asks this problem because it tests hash set lookups combined with pairwise comparison logic. The non-transitivity constraint — similar(A, B) and similar(B, C) does NOT imply similar(A, C) — is a subtle rule that distinguishes this from the harder Sentence Similarity II, which uses Union-Find for transitive similarity. Interviewers test whether candidates understand the problem boundaries before choosing a data structure.
The pattern is hash set with pairwise validation. Build a hash set of similar pairs (store each pair as a frozenset or as both (w1, w2) and (w2, w1) in the set for symmetric lookup). Then check that the two sentences have equal length. For each index i, verify that words1[i] == words2[i] or (words1[i], words2[i]) is in the similar-pairs set. If any position fails, return false. Otherwise return true.
sentence1: ["great", "acting", "skills"]
sentence2: ["fine", "drama", "talent"]
pairs: [["great", "fine"], ["drama", "acting"], ["skills", "talent"]]
Build set: {("great","fine"), ("fine","great"), ("drama","acting"), ("acting","drama"), ("skills","talent"), ("talent","skills")}.
Return true.
("great", "fine") and ("fine", "great") must both be in the set for symmetric lookup.For the Sentence Similarity coding problem, the hash table string interview pattern with symmetric pair storage is the approach. The key distinction from Sentence Similarity II is transitivity — know when to use a set (no transitivity) versus Union-Find (with transitivity). Google interviewers often ask both versions back-to-back, so be ready to explain this architectural difference. In Python, storing pairs as frozenset naturally handles symmetry without duplicating entries.