0PricingLogin
SQL Interview Prep · Lesson

Building a Retention Matrix

Counting active users by cohort and period offset to form a retention table.

What a Retention Matrix Is

The follow-up to defining a cohort is the famous retention matrix: rows are cohorts, columns are period offsets (month 0, 1, 2, ...), and each cell counts how many of that cohort were still active at that offset.

Interviewers love this because it forces you to combine cohort assignment, a join back to activity, a period-difference calculation, and a pivot. It is the single most representative query of product analytics.

The Two Inputs

You need two things: each user's cohort period (from the previous lesson) and a record of every active period per user. Activity comes from the same events table, collapsed to the period grain.

So plan the query as: cohort CTE, then an activity CTE that lists which months each user was active, then join them.

WITH user_cohort AS (
  SELECT user_id,
    DATE_TRUNC('month', MIN(event_at)) AS cohort_month
  FROM events
  GROUP BY user_id
)
SELECT * FROM user_cohort;

All lessons in this course

  1. Defining a Cohort by First Action
  2. Building a Retention Matrix
  3. Day-N and Rolling Retention
  4. Churn and Resurrection Queries
← Back to SQL Interview Prep