Median from Data Stream and K-Way Merge
Maintain two heaps (max-heap of small half, min-heap of large half) for O(log n) median updates, and merge k sorted lists using a heap.
Median from Data Stream Problem
Find Median from Data Stream (LeetCode #295) asks you to support two operations efficiently: addNum(num) to add a number and findMedian() to return the current median. The median of an even-length list is the average of the two middle values. A brute-force sorted list gives O(n) insert and O(1) median. The optimal solution uses two heaps for O(log n) insert and O(1) median.
import heapq
# Strategy: maintain two halves of the data
# max_heap: lower half (stores negated values for max behavior)
# min_heap: upper half
# Invariant: len(max_heap) == len(min_heap) or len(max_heap) == len(min_heap) + 1
# Invariant: max(max_heap) <= min(min_heap)
# Median:
# odd count: max_heap[0] (top of lower half)
# even count: average of tops of both halves
print('Two-heap strategy for O(log n) insert, O(1) median')Two-Heap MedianFinder Implementation
Maintain a max-heap for the lower half and a min-heap for the upper half. Always ensure the max-heap has the same size or one more element than the min-heap. When adding a number: push to max-heap, then balance by moving the max-heap's top to the min-heap if the top exceeds the min-heap's minimum, and rebalance sizes if needed.
import heapq
class MedianFinder:
def __init__(self):
self.lo = [] # max-heap (negated) for lower half
self.hi = [] # min-heap for upper half
def addNum(self, num):
heapq.heappush(self.lo, -num) # push to lower half
# Ensure max of lower <= min of upper
if self.hi and -self.lo[0] > self.hi[0]:
heapq.heappush(self.hi, -heapq.heappop(self.lo))
# Balance sizes: lo can have at most 1 more than hi
if len(self.lo) > len(self.hi) + 1:
heapq.heappush(self.hi, -heapq.heappop(self.lo))
elif len(self.hi) > len(self.lo):
heapq.heappush(self.lo, -heapq.heappop(self.hi))
def findMedian(self):
if len(self.lo) > len(self.hi):
return -self.lo[0] # odd count: top of lower half
return (-self.lo[0] + self.hi[0]) / 2
mf = MedianFinder()
for n in [1, 2, 3, 4, 5]: mf.addNum(n)
print(mf.findMedian()) # 3.0All lessons in this course
- Heap Property and Array Representation
- Heapify, Push, and Pop from Scratch
- Python heapq and Max-Heap Tricks
- Median from Data Stream and K-Way Merge