Finding Gaps in a Sequence
Detecting missing values and the start/end of each gap.
Now Hunt The Gaps
So far we have grouped rows into islands. The mirror-image interview question is: which values are missing? Interviewers phrase it as "find the gaps in this ID sequence," "which invoice numbers were skipped," or "on which days was there no activity."
Gaps are the empty spaces between islands. The key realization is that you usually do not need to list every single missing value; you need to report the start and end of each gap range, which is far more compact and is what interviewers expect.
The Sample Gap Dataset
Reuse the present values 1, 2, 3, 7, 8, 10 from a table seq(n). The gaps to report are:
- From 4 to 6 (after the first island, before 7)
- From 9 to 9 (between 8 and 10)
Notice we describe a gap as a range: gap_start = last present value + 1, gap_end = next present value - 1. That compact form is the goal of the core technique below.
CREATE TABLE seq (n INT);
INSERT INTO seq VALUES (1),(2),(3),(7),(8),(10);