Magicsheet logo

Number of Transactions per Visit

Hard
100%
Updated 6/1/2025

Asked by 2 Companies

Topics

Number of Transactions per Visit

What is this problem about?

The Number of Transactions per Visit is a hard SQL problem where you report, for each possible number of transactions (0, 1, 2, ...), how many users visited the website exactly that many times without making any transaction. It involves counting visits, joining with transaction data, and generating a sequence of numbers for the full distribution. This requires generating rows programmatically in SQL.

Why is this asked in interviews?

Machine Zone asks this hard SQL problem because it requires generating a complete distribution (0 transactions, 1 transaction, 2 transactions, up to max) even for counts with no users. This tests window functions, recursive CTEs or number generation, LEFT JOIN for zero-count rows, and aggregation. The database interview pattern at an advanced level is demonstrated.

Algorithmic pattern used

Number generation + LEFT JOIN + aggregation. First, compute visits without transactions per user (LEFT JOIN visits with transactions, count where transaction_id IS NULL). Then generate all possible transaction counts (0 to max) using a recursive CTE or a generated numbers table. LEFT JOIN the user counts with the generated numbers. Group and count.

Example explanation

Visits: User 1 visited 3 times (1 without transaction, 2 with). User 2 visited 1 time (0 with transaction). Distribution: 0 transactions: 1 user. 1 transaction: 0 users. 2 transactions: 1 user. Output includes all rows from 0 to max, even those with count=0.

Common mistakes candidates make

  • Missing users with 0 transactions (must use LEFT JOIN, not INNER JOIN).
  • Not generating the full range of transaction counts (output requires all values 0..max).
  • Incorrect GROUP BY when joining the generated sequence.
  • Using COUNT(*) instead of COUNT(DISTINCT user_id).

Interview preparation tip

Hard SQL problems often require generating number sequences. In MySQL: use WITH RECURSIVE cte AS (SELECT 0 AS n UNION ALL SELECT n+1 FROM cte WHERE n < max_count). In PostgreSQL: generate_series(0, max_count). Practice combining recursive CTEs with LEFT JOIN for "fill in missing values in a distribution" problems — this is a standard analytics SQL pattern for reporting full distributions.

Similar Questions