Magicsheet logo

Find Median from Data Stream

Hard
55.9%
Updated 6/1/2025

Find Median from Data Stream

What is this problem about?

The Find Median from Data Stream interview question asks you to design a data structure that supports two operations:

  1. addNum(num): Adds an integer from the data stream to the collection.
  2. findMedian(): Returns the median of all elements seen so far. The median is the middle value in a sorted list. If the list has an even number of elements, the median is the average of the two middle values.

Why is this asked in interviews?

This is one of the most famous "Hard" design questions, asked by almost every big tech company (Apple, Uber, Amazon, Google). It tests your ability to maintain sorted order in a dynamic dataset. A naive solution (sorting every time) is O(n2)O(n^2). The challenge is to achieve O(logn)O(\log n) for adding and O(1)O(1) for finding the median using the Heap interview pattern.

Algorithmic pattern used

The problem is solved using Two Heaps (Dual Heap).

  1. Max-Heap: Stores the smaller half of the numbers.
  2. Min-Heap: Stores the larger half of the numbers.
  3. Balancing: Ensure that the Max-Heap size is equal to or one greater than the Min-Heap size.
  4. Addition Logic:
    • Add to Max-Heap.
    • Move the largest of the Max-Heap to the Min-Heap.
    • If Min-Heap becomes larger, move its smallest back to Max-Heap.
  5. Median: If total elements are odd, it's the top of Max-Heap. If even, it's the average of both tops.

Example explanation

Stream: 1, 3, 2.

  1. Add 1: MaxHeap: [1], MinHeap: []. Median = 1.
  2. Add 3: MaxHeap: [1], MinHeap: [3]. Median = (1+3)/2 = 2.
  3. Add 2: MaxHeap: [1, 2], MinHeap: [3]. Median = 2. This approach keeps the "center" of the data always at the tops of the heaps.

Common mistakes candidates make

  • Inefficient Sorting: Calling sort() on a list for every new number (O(n2)O(n^2)).
  • Incorrect Heap type: Mixing up which heap is for the smaller half and which is for the larger.
  • Odd/Even logic: Forgetting to handle the average for even-sized streams.

Interview preparation tip

The "Two Heaps" pattern is the standard solution for any problem involving a "sliding" or "dynamic" median. Practice explaining the balancing step, as it's the most critical part of the logic.

Similar Questions