
Finding the Topic of Each Post

Hard
25%
Updated 8/1/2025

Asked by 1 Company

1. What is this problem about?

The "Finding the Topic of Each Post" interview question is a SQL challenge built around keyword-based classification. You are typically given two tables: one holding social media posts (an ID and the text content) and another mapping keywords to topics. The goal is to report, for each post, every topic that has at least one keyword appearing in the post's text.

2. Why is this asked in interviews?

Meta and other data-driven companies use the Finding the Topic of Each Post coding problem to assess a candidate's ability to handle complex string matching and data aggregation in SQL. It evaluates your understanding of non-equi joins, pattern matching (using LIKE or regex), and the ability to group and format results into a single comma-separated string, which is a common requirement for generating reports.

3. Algorithmic pattern used

This problem combines two database techniques: a pattern-match join and string aggregation.

  1. Join with LIKE: Join the Posts table to the Keywords table on a condition that checks whether the post content contains the keyword.
  2. Filtering: Match whole words only (e.g., "sea" should not match "search"). A common approach is to pad both the content and the keyword with spaces before comparing.
  3. Grouping: Results are grouped by post ID.
  4. Aggregation: Use a function like GROUP_CONCAT (MySQL) or STRING_AGG (PostgreSQL) to combine all unique topics found for a single post into a sorted, comma-separated list.
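
The four steps above can be sketched end to end. This is a minimal illustration, not a reference solution: it runs the join in an in-memory SQLite database through Python's sqlite3 module, and the table names (Posts, Keywords) and column names (post_id, content, topic, word) are assumptions chosen for the example. It also assumes post content contains only letters and spaces. Because the element order of SQLite's group_concat is not guaranteed in older versions, the final comma-joining is done in Python; in MySQL the equivalent would be GROUP_CONCAT(DISTINCT topic ORDER BY topic), and in PostgreSQL STRING_AGG(DISTINCT topic, ',' ORDER BY topic).

```python
import sqlite3


def topics_per_post(posts, keywords):
    """Return {post_id: "TopicA,TopicB"} for posts matching at least one keyword.

    posts:    iterable of (post_id, content) rows    (hypothetical schema)
    keywords: iterable of (topic, word) rows         (hypothetical schema)
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Posts (post_id INTEGER, content TEXT)")
    conn.execute("CREATE TABLE Keywords (topic TEXT, word TEXT)")
    conn.executemany("INSERT INTO Posts VALUES (?, ?)", posts)
    conn.executemany("INSERT INTO Keywords VALUES (?, ?)", keywords)

    # Steps 1-2: non-equi join with LIKE. Padding both sides with spaces
    # makes the keyword match only as a whole word.
    # DISTINCT removes repeated (post, topic) pairs; ORDER BY sorts topics.
    rows = conn.execute(
        """
        SELECT DISTINCT p.post_id, k.topic
        FROM Posts AS p
        JOIN Keywords AS k
          ON (' ' || LOWER(p.content) || ' ')
             LIKE ('% ' || LOWER(k.word) || ' %')
        ORDER BY p.post_id, k.topic
        """
    ).fetchall()
    conn.close()

    # Steps 3-4: group by post ID and build the comma-separated list.
    grouped = {}
    for post_id, topic in rows:
        grouped.setdefault(post_id, []).append(topic)
    return {post_id: ",".join(topics) for post_id, topics in grouped.items()}


posts = [(1, "We sailed the sea at dawn"), (2, "Search results load fast")]
keywords = [("Travel", "sea"), ("Tech", "search"), ("Tech", "results")]
print(topics_per_post(posts, keywords))  # {1: 'Travel', 2: 'Tech'}
```

Post 2 contains the letters "sea" inside "Search" but is not tagged Travel, and it lists Tech only once even though two Tech keywords match.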

4. Example explanation

Suppose we have a post: "The new smartphone has a great camera." And our keywords are:

  • Topic: Tech, Keywords: "smartphone", "laptop"
  • Topic: Gadgets, Keywords: "camera", "phone"
  • Topic: Food, Keywords: "apple", "bread"

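The post contains "smartphone" (Tech) and "camera" (Gadgets); "phone" does not count because it appears only inside "smartphone". After joining and aggregating, the output for this post ID would be "Gadgets,Tech", with topics sorted alphabetically. A post that matches no keywords receives whatever placeholder the problem statement specifies, such as "No Topic".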
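
To check the example by hand, the same logic can be mirrored in a few lines of Python. This is only a sketch for verifying the expected output, not the SQL an interviewer would want: the function name label_post and the fallback parameter are hypothetical, and keywords are assumed to be stored in lowercase. Splitting the text on non-letters also drops the trailing period after "camera".

```python
import re

# Keyword dictionary from the example above.
KEYWORDS = {
    "Tech": ["smartphone", "laptop"],
    "Gadgets": ["camera", "phone"],
    "Food": ["apple", "bread"],
}


def label_post(content, keywords=KEYWORDS, fallback="No Topic"):
    """Return the sorted, comma-separated topics whose keywords appear as whole words."""
    # Lowercase the text and keep only runs of letters, so "camera." -> "camera".
    words = set(re.findall(r"[a-z]+", content.lower()))
    matched = sorted(topic for topic, kws in keywords.items() if words & set(kws))
    return ",".join(matched) if matched else fallback


print(label_post("The new smartphone has a great camera."))  # Gadgets,Tech
print(label_post("We went hiking all weekend"))              # No Topic
```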

5. Common mistakes candidates make

  • Substring Mis-matching: Matching keywords that are part of larger words. For example, the keyword "art" might accidentally match the word "smart" if boundaries aren't handled correctly.
  • Duplicate Topics: If a post contains both "smartphone" and "laptop", it might list the topic "Tech" twice unless DISTINCT is used within the aggregation function.
  • Case Sensitivity: Forgetting that "Camera" and "camera" should be treated as the same word, requiring the use of LOWER() or case-insensitive matching.
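
The three mistakes are easy to reproduce side by side. The sketch below is illustrative only: naive_topics is deliberately buggy, both function names are hypothetical, and the input is assumed to contain no punctuation. The substring test plays the role of LIKE '%word%', the set plays the role of DISTINCT, and lower() plays the role of LOWER().

```python
def naive_topics(content, keyword_rows):
    """Deliberately buggy: substring test, case-sensitive, no de-duplication."""
    return [topic for topic, word in keyword_rows if word in content]


def careful_topics(content, keyword_rows):
    """Whole-word, case-insensitive matching with duplicates removed and topics sorted."""
    words = set(content.lower().split())
    return sorted({topic for topic, word in keyword_rows if word.lower() in words})


rows = [("Art", "art"), ("Tech", "smartphone"), ("Tech", "laptop"), ("Gadgets", "camera")]
post = "A smart Camera beside my smartphone and laptop"

print(naive_topics(post, rows))    # ['Art', 'Tech', 'Tech']
print(careful_topics(post, rows))  # ['Gadgets', 'Tech']
```

The naive version tags the post as Art because of "smart", lists Tech twice, and misses Gadgets because of the capital "C".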

6. Interview preparation tip

Practice join conditions that go beyond simple ID equality. Knowing how to use LIKE with the % wildcard and how vendor-specific aggregation functions such as GROUP_CONCAT behave will help you stand out in SQL-heavy interviews. Always clarify how the interviewer wants punctuation and word boundaries handled.
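
Why the punctuation question matters can be seen by comparing two matching strategies on the same sentence. This is a rough sketch with hypothetical function names: padded_match imitates the space-padding trick used with LIKE, while boundary_match uses a regular-expression word boundary, which is one possible alternative when the database supports regex matching.

```python
import re


def padded_match(content, word):
    """Imitates ' ' || content || ' ' LIKE '% word %' (case-insensitive)."""
    return f" {word.lower()} " in f" {content.lower()} "


def boundary_match(content, word):
    """Whole-word match using regex word boundaries, ignoring case."""
    pattern = rf"\b{re.escape(word)}\b"
    return re.search(pattern, content, flags=re.IGNORECASE) is not None


post = "The new smartphone has a great camera."

print(padded_match(post, "camera"))    # False: the trailing period blocks the match
print(boundary_match(post, "camera"))  # True
print(boundary_match(post, "phone"))   # False: "phone" only occurs inside "smartphone"
```

If the problem guarantees that content contains only letters and spaces, the padding approach is enough; otherwise punctuation has to be stripped or a boundary-aware pattern used.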
