Magicsheet logo

Game Play Analysis II

Easy
100%
Updated 6/1/2025

Asked by 2 Companies

Topics

Game Play Analysis II

What is this problem about?

The Game Play Analysis II coding problem builds on the first one. Given the same Activity table, you now need to report the device_id that each player used on their very first login. This is slightly trickier because you can't just use MIN(device_id)—you need the specific device associated with the MIN(event_date).

Why is this asked in interviews?

Amazon asks this Database interview pattern to test your ability to retrieve entire rows based on an aggregated subquery or using Window Functions. It separates beginners who only know basic GROUP BY from intermediate SQL users who can link an aggregated result back to the original dataset.

Algorithmic pattern used

There are two main ways to solve this:

  1. Subquery with IN / JOIN:
    • Create a subquery that finds the player_id and MIN(event_date) for each player (exactly like Game Play Analysis I).
    • Join this subquery back to the original Activity table on both player_id and event_date to retrieve the corresponding device_id.
  2. Window Functions (Preferred):
    • Use FIRST_VALUE(device_id) OVER(PARTITION BY player_id ORDER BY event_date) to directly grab the first device.
    • Or use ROW_NUMBER() OVER(PARTITION BY player_id ORDER BY event_date) as rn and filter for rn = 1.

Example explanation

Activity:

player_iddevice_idevent_date
122016-03-01
132016-05-02

Using ROW_NUMBER:

  1. Partition by player 1, order by date.
  2. Row (1, 2, 2016-03-01) gets rn = 1.
  3. Row (1, 3, 2016-05-02) gets rn = 2.
  4. Outer query filters WHERE rn = 1, resulting in player_id = 1, device_id = 2.

Common mistakes candidates make

  • Misusing MIN: Writing SELECT player_id, MIN(event_date), device_id GROUP BY player_id. This is invalid SQL in most databases because device_id is not aggregated. In MySQL (without strict mode), it might return a random device_id, which is incorrect.
  • Subquery performance: If using the subquery IN approach, it can be slow on very large tables compared to Window Functions.

Interview preparation tip

Window functions like ROW_NUMBER() are the modern, clean way to solve "find the row holding the min/max value" problems. Practice the PARTITION BY and ORDER BY syntax within the OVER() clause.

Similar Questions