Unique Email Addresses

Easy
69.5%
Updated 6/1/2025

What is this problem about?

The Unique Email Addresses interview question is a classic string processing challenge that mimics real-world data cleaning. You are given a list of email addresses. Each address consists of a local name and a domain name, separated by an '@'. The problem introduces two specific rules for local names: periods ('.') are ignored, and everything after a plus sign ('+') is ignored. Your task is to find the number of unique "actual" email addresses that receive mail.

Why is this asked in interviews?

Companies like Google and Intuit use the Unique Email Addresses coding problem to assess a candidate's ability to handle string manipulation and rule-based logic. It’s a practical problem that tests whether you can accurately follow complex specifications and use appropriate data structures to handle duplicates. It also evaluates your ability to separate the logic for the local name and the domain name correctly.

Algorithmic pattern used

This problem combines the Array, Hash Table, and String interview patterns: you iterate through each email and "normalize" its local name. For each email, split it at the '@' into the local and domain parts. Then process the local part: truncate everything from the first plus sign onwards and remove all periods. Finally, rejoin the normalized local part with the unchanged domain name and add the result to a Set (Hash Set). The size of the Set at the end is the number of unique addresses.
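The pattern described above can be sketched in Python as follows. This is a minimal illustration, not a reference solution: the function name `num_unique_emails` is chosen here for the example, and the code assumes every input contains exactly one '@', as the problem statement implies.

```python
def num_unique_emails(emails):
    """Count distinct addresses after normalizing each local name."""
    seen = set()
    for email in emails:
        # Split at the '@' (assumed to appear exactly once).
        local, _, domain = email.partition("@")
        # Drop everything from the first '+' onwards, then remove periods.
        local = local.split("+", 1)[0].replace(".", "")
        # The domain is rejoined unchanged.
        seen.add(local + "@" + domain)
    return len(seen)
```

Each email is scanned a constant number of times, so the running time is linear in the total number of characters, and the Set holds at most one entry per input address.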

Example explanation

Suppose you have the email: test.email+alex@gmail.com.

  1. Split into local (test.email+alex) and domain (gmail.com).
  2. Ignore the +alex: Local becomes test.email.
  3. Remove the .: Local becomes testemail.
  4. Rejoin: testemail@gmail.com.

If you also had testemail@gmail.com in the list, both would map to the same entry in your Set, counting as only one unique address.
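The four steps above can be traced in a few lines of Python. The helper name `normalize_steps` is hypothetical and exists only to expose each intermediate value from the walkthrough.

```python
def normalize_steps(email):
    """Return the intermediate values from the walkthrough, in order."""
    local, domain = email.split("@")         # step 1: split at '@'
    after_plus = local.split("+", 1)[0]      # step 2: ignore '+alex'
    no_dots = after_plus.replace(".", "")    # step 3: remove '.'
    return [local, after_plus, no_dots, f"{no_dots}@{domain}"]  # step 4: rejoin

steps = normalize_steps("test.email+alex@gmail.com")
# steps == ['test.email+alex', 'test.email', 'testemail', 'testemail@gmail.com']

# Adding the already-canonical address does not grow the set.
unique = {steps[-1], "testemail@gmail.com"}
# len(unique) == 1
```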

Common mistakes candidates make

A very common mistake is applying the period or plus sign rules to the domain name. The problem specifically states that these rules apply only to the local name. Another error is building the result with repeated string concatenation in a loop without considering performance (in some languages, a list of characters or a string builder is more efficient). Finally, mishandling multiple plus signs or multiple periods leads to incorrect normalization: everything from the first plus sign onwards is dropped, and every period in the local name is removed, not just the first.
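One way to avoid all three pitfalls is a single pass over the local name that collects characters in a list. The sketch below is illustrative only: `normalize` is a name chosen for this example, and the input is assumed to contain an '@'.

```python
def normalize(email):
    """Return the canonical form of one email address."""
    at = email.index("@")          # position of the '@' separator
    chars = []                     # list of characters instead of repeated concatenation
    for ch in email[:at]:          # only the local name is scanned
        if ch == "+":
            break                  # the first '+' ends the local name, later ones never matter
        if ch != ".":
            chars.append(ch)       # every period is skipped, not just the first
    return "".join(chars) + email[at:]  # domain appended untouched, periods included
```

For example, `normalize("a.b.c+x+y@lee.tcode.com")` yields `abc@lee.tcode.com`: both plus signs and all local-name periods are handled, while the period inside the domain survives.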

Interview preparation tip

When a problem involves identifying unique items based on specific rules, a Hash Set is usually the right tool. Focus on the transformation logic: how to take a raw input and turn it into a canonical form. This "normalization" step is a common pattern in many data-driven interview questions.
