Subdomain Visit Count

Medium
47.6%
Updated 6/1/2025

What is this problem about?

"Subdomain Visit Count" is a medium-difficulty string and data structure problem. You are given a list of "count-paired domains" where each entry consists of a visit count and a domain name (e.g., "9001 discuss.google.com"). You need to calculate the total number of visits for every subdomain mentioned. For "discuss.google.com," the subdomains are "discuss.google.com," "google.com," and "com." The output should be a list of the counts and their corresponding subdomains in any order.

Why is this asked in interviews?

Companies like Microsoft, Roblox, and Google use this question to evaluate a candidate's string parsing and hash map usage. It's a very practical problem that mimics real-world log processing. It tests whether you can correctly split strings, iterate through the different levels of a domain hierarchy, and aggregate data efficiently. It's also a good test of how you handle input strings with different numbers of segments.

Algorithmic pattern used

The pattern for this problem is String Parsing and Hash Table Aggregation. For each input string:

  1. Split the string to separate the count and the full domain.
  2. Iterate through the domain to find all possible subdomains. This is usually done by finding the indices of the dots ('.') and taking substrings from those points to the end of the string.
  3. For each subdomain, update its total count in a Hash Map.

Finally, iterate through the Hash Map and format the results into the required output strings.
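The steps above can be sketched in Python as follows. This is a minimal illustration, not a reference solution: the function name `subdomain_visits` and the parameter name `cpdomains` are assumptions chosen for this example, and the input is assumed to be well formed (one space between the count and the domain).

```python
from collections import defaultdict


def subdomain_visits(cpdomains):
    """Return "count subdomain" strings for every subdomain seen (hypothetical helper)."""
    totals = defaultdict(int)  # subdomain -> accumulated visit count

    for entry in cpdomains:
        # Step 1: separate the count from the full domain.
        count_text, domain = entry.split(" ", 1)
        count = int(count_text)

        # Step 2 and 3: the full domain counts as a subdomain, and so does
        # every suffix that starts right after a dot.
        totals[domain] += count
        for i, ch in enumerate(domain):
            if ch == ".":
                totals[domain[i + 1:]] += count

    # Final step: format the aggregated totals as output strings.
    return [f"{count} {name}" for name, count in totals.items()]
```

Because the problem accepts the results in any order, the order in which the map yields its entries does not matter.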

Example explanation

Input: ["50 info.example.org"]

  1. Parse count 50 and domain "info.example.org".
  2. Subdomains:
    • "info.example.org" -> Map["info.example.org"] += 50
    • "example.org" -> Map["example.org"] += 50
    • "org" -> Map["org"] += 50

If another input was "10 example.org", then Map["example.org"] would become 60 and Map["org"] would become 60.
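The arithmetic in this example can be checked with a short script. It is only a sketch: the helper name `count_subdomains` is made up for this illustration, and it builds each subdomain by splitting on dots and rejoining the trailing labels rather than by slicing at dot indices.

```python
from collections import Counter


def count_subdomains(entries):
    """Aggregate visit counts per subdomain (hypothetical helper for this example)."""
    totals = Counter()
    for entry in entries:
        count_text, domain = entry.split()
        labels = domain.split(".")  # e.g. ["info", "example", "org"]
        for start in range(len(labels)):
            # Join the labels from `start` to the end to form one subdomain.
            totals[".".join(labels[start:])] += int(count_text)
    return totals


totals = count_subdomains(["50 info.example.org", "10 example.org"])
print(totals["info.example.org"])  # 50
print(totals["example.org"])       # 60
print(totals["org"])               # 60
```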

Common mistakes candidates make

A common mistake is not correctly identifying all levels of the subdomain (e.g., forgetting the top-level domain such as "com", which must be counted as its own entry). Another mistake is inefficiently recreating strings in a loop, which can lead to higher time complexity in languages where strings are immutable. Using a complex recursive solution is also usually overkill for this problem; a simple iterative approach using split and substring is much cleaner.
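One way to guard against the missing-level mistake is to isolate subdomain generation in a small helper that can be tested on its own. The generator below is a sketch under that assumption; the name `iter_suffixes` is hypothetical. It walks the dots with `str.find` and always yields the last label, so the top-level domain is never skipped.

```python
def iter_suffixes(domain):
    """Yield the full domain, then every suffix that follows a dot."""
    yield domain
    dot = domain.find(".")
    while dot != -1:
        # Everything after this dot is one subdomain level.
        yield domain[dot + 1:]
        dot = domain.find(".", dot + 1)


print(list(iter_suffixes("discuss.google.com")))
# ['discuss.google.com', 'google.com', 'com']
```

A domain with no dots yields only itself, which covers inputs with a single segment.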

Interview preparation tip

For the Subdomain Visit Count interview question, focus on writing clean and modular code. Mention how you handle the string splitting—using a built-in split function is usually fine. Be careful with space complexity; the number of entries in your hash map depends on the number of unique subdomains, which is a good point to discuss with your interviewer.
