0Pricing
DSA Interview Prep · Lesson

Median from Data Stream and K-Way Merge

Maintain two heaps (max-heap of small half, min-heap of large half) for O(log n) median updates, and merge k sorted lists using a heap.

Median from Data Stream Problem

Find Median from Data Stream (LeetCode #295) asks you to support two operations efficiently: addNum(num) to add a number and findMedian() to return the current median. The median of an even-length list is the average of the two middle values. A brute-force sorted list gives O(n) insert and O(1) median. The optimal solution uses two heaps for O(log n) insert and O(1) median.

import heapq

# Strategy: maintain two halves of the data
# max_heap: lower half (stores negated values for max behavior)
# min_heap: upper half
# Invariant: len(max_heap) == len(min_heap) or len(max_heap) == len(min_heap) + 1
# Invariant: max(max_heap) <= min(min_heap)
# Median:
#   odd count:  max_heap[0] (top of lower half)
#   even count: average of tops of both halves
print('Two-heap strategy for O(log n) insert, O(1) median')

Two-Heap MedianFinder Implementation

Maintain a max-heap for the lower half and a min-heap for the upper half. Always ensure the max-heap has the same size or one more element than the min-heap. When adding a number: push to max-heap, then balance by moving the max-heap's top to the min-heap if the top exceeds the min-heap's minimum, and rebalance sizes if needed.

import heapq

class MedianFinder:
    def __init__(self):
        self.lo = []  # max-heap (negated) for lower half
        self.hi = []  # min-heap for upper half

    def addNum(self, num):
        heapq.heappush(self.lo, -num)   # push to lower half
        # Ensure max of lower <= min of upper
        if self.hi and -self.lo[0] > self.hi[0]:
            heapq.heappush(self.hi, -heapq.heappop(self.lo))
        # Balance sizes: lo can have at most 1 more than hi
        if len(self.lo) > len(self.hi) + 1:
            heapq.heappush(self.hi, -heapq.heappop(self.lo))
        elif len(self.hi) > len(self.lo):
            heapq.heappush(self.lo, -heapq.heappop(self.hi))

    def findMedian(self):
        if len(self.lo) > len(self.hi):
            return -self.lo[0]  # odd count: top of lower half
        return (-self.lo[0] + self.hi[0]) / 2

mf = MedianFinder()
for n in [1, 2, 3, 4, 5]: mf.addNum(n)
print(mf.findMedian())  # 3.0

All lessons in this course

  1. Heap Property and Array Representation
  2. Heapify, Push, and Pop from Scratch
  3. Python heapq and Max-Heap Tricks
  4. Median from Data Stream and K-Way Merge
← Back to DSA Interview Prep