Most Common Word

Easy
91.6%
Updated 6/1/2025

What is this problem about?

The Most Common Word problem gives you a paragraph of text and a list of banned words. The task is to return the most frequently occurring word that is not in the banned list. Matching is case-insensitive, and punctuation is treated as a separator rather than as part of a word. The problem tests string cleaning, tokenization, and frequency counting.

Why is this asked in interviews?

Microsoft, Meta, Amazon, and Google ask this as a practical text processing problem that tests string manipulation discipline. It checks that candidates can handle real-world text inputs: removing punctuation, lowercasing, splitting into words, and filtering against a banned set. It draws on the array, hash table, counting, and string interview patterns.

Algorithmic pattern used

String normalization + frequency map. Normalize the paragraph: convert it to lowercase, replace every non-alphabetic character with a space, then split on whitespace. Build a frequency map of the resulting words. Convert the banned words to a set for average O(1) lookup. Iterate over the frequency map and keep the highest-count word that is not in the banned set.
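The steps above can be sketched in Python roughly as follows. The function name `most_common_word` and the choice to return an empty string when every word is banned are assumptions made for illustration, not part of the problem statement.

```python
import re
from collections import Counter


def most_common_word(paragraph: str, banned: list[str]) -> str:
    # Lowercase, then turn every non-letter into a space so "b,b" becomes "b b".
    normalized = re.sub(r"[^a-z]", " ", paragraph.lower())
    # split() with no argument collapses runs of whitespace.
    words = normalized.split()

    # A set gives average O(1) membership checks.
    banned_set = {word.lower() for word in banned}
    counts = Counter(words)

    # Keep the highest-count word that is not banned.
    best_word, best_count = "", 0  # assumption: "" if nothing qualifies
    for word, count in counts.items():
        if word not in banned_set and count > best_count:
            best_word, best_count = word, count
    return best_word
```

Calling `most_common_word("Bob. hIt, Ball BOB Hit.", ["bob", "hit"])` on the example from the next section returns `"ball"`.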

Example explanation

Paragraph: "Bob. hIt, Ball BOB Hit.", banned: ["bob", "hit"].

  • Normalize: "bob hit ball bob hit".
  • Split: ["bob", "hit", "ball", "bob", "hit"].
  • Frequencies: {bob:2, hit:2, ball:1}.
  • Filter banned: only "ball" remains.
  • Most common = "ball".

Common mistakes candidates make

  • Not stripping punctuation (treating "bob." as different from "bob").
  • Case sensitivity errors (not lowercasing before counting).
  • Using a list for banned words (O(n) lookup vs O(1) for a set).
  • Not handling multiple spaces or punctuation between words.
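A short sketch of how two of these mistakes show up in practice. The sample strings and variable names are illustrative assumptions.

```python
from collections import Counter

text = "Bob. hIt, Ball BOB Hit."

# Mistake: counting raw tokens keeps case and punctuation,
# so "Bob." and "BOB" are counted as different words.
naive_counts = Counter(text.split())

# Mistake: splitting on a single space leaves empty tokens
# wherever separators repeat.
spaced = "bob  hit   ball"
single_space_tokens = spaced.split(" ")  # includes '' entries
whitespace_tokens = spaced.split()       # ['bob', 'hit', 'ball']
```

In `naive_counts` every token has a count of 1, so no word stands out as most common. Lowercasing and stripping punctuation first is what lets the two occurrences of "bob" and "hit" merge.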

Interview preparation tip

Text processing problems almost always need a normalization step before any algorithmic work. The pattern: lowercase → replace non-alpha with spaces → split → count → filter. Regex is clean for normalization: re.sub(r'[^a-z]', ' ', paragraph.lower()), followed by split(). Substitute a space rather than an empty string, otherwise input such as "b,b" collapses into the single token "bb". Practice writing string normalization code from memory; getting this step right quickly in interviews frees mental bandwidth for the algorithmic portion.
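To see why the replacement string in the substitution matters, here is a small comparison on an input where words are separated only by commas. The sample paragraph and variable names are assumptions chosen for illustration. `re.findall` is shown as one possible alternative to substitute-then-split.

```python
import re

paragraph = "a, a, a, a, b,b,b,c, c"
lowered = paragraph.lower()

# Deleting punctuation glues neighbouring words together ("b,b,b,c" -> "bbbc").
deleted = re.sub(r"[^a-z\s]", "", lowered).split()

# Replacing punctuation with a space keeps word boundaries intact.
replaced = re.sub(r"[^a-z\s]", " ", lowered).split()

# Alternative: extract runs of letters directly, with no split step.
found = re.findall(r"[a-z]+", lowered)
```

Here `deleted` contains the bogus token "bbbc", while `replaced` and `found` both produce the same nine single-letter words.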
