Cohort Retention Table
Build a cohort retention matrix that shows, for each signup cohort, the percentage of its week-0 users still active in each subsequent week, using pivot_table.
What Is a Cohort Retention Table?
A cohort retention table groups users by the week (or month) they first used the product and then tracks what percentage of each cohort is still active in subsequent weeks. Each row is one cohort and each column is the number of weeks since that cohort's first event. The week-0 column always shows 100% because week 0 is by definition the signup week, so every user in the cohort is active in it. Values in later columns reveal how quickly the product loses users: the slower the decay, the stronger the retention.
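To make the definition concrete, here is a minimal sketch of the percentage calculation on a hand-built table. The counts, cohort labels, and the names counts, retention, and week_number are made up for illustration and do not come from user_events.csv.

```python
import pandas as pd

# Hypothetical active-user counts: one row per cohort,
# one column per number of weeks since the cohort's first event.
counts = pd.DataFrame(
    {0: [200, 150], 1: [120, 75], 2: [80, 60]},
    index=['2024-01-01', '2024-01-08'],
)
counts.index.name = 'cohort_week'
counts.columns.name = 'week_number'

# Divide every row by its own week-0 count and scale to a percentage.
retention = counts.div(counts[0], axis=0).mul(100).round(1)
print(retention)
```

The first cohort starts with 200 users and has 120 active in week 1, so its week-1 retention is 60%. The week-0 column is 100 for every cohort because each row is divided by its own week-0 count.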
import pandas as pd
df = pd.read_csv('user_events.csv', parse_dates=['event_date'])
print(df.dtypes)
print(df.head())
Assigning Each User a Cohort Week
The cohort week is the week of each user's first event. Use groupby('user_id')['event_date'].min() to find the first event date per user, then convert it to a weekly period with .dt.to_period('W'). By default these periods run Monday to Sunday, which matches ISO week boundaries. Join this cohort label back to the main events DataFrame on user_id so every row knows which cohort week its user belongs to.
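# First event date per user, converted to a weekly period (the cohort label).
cohort_week = df.groupby('user_id')['event_date'].min().dt.to_period('W').rename('cohort_week')
# Attach the cohort label to every event row by matching on user_id.
df = df.join(cohort_week, on='user_id')
# Weekly period in which each individual event occurred.
df['event_week'] = df['event_date'].dt.to_period('W')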
print(df[['user_id', 'event_date', 'cohort_week', 'event_week']].head())
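With cohort_week and event_week on every row, the remaining step is the pivot_table call named in the lesson summary. The sketch below is one possible way to finish the matrix, not necessarily the course's own solution. It uses a small made-up events table so it runs on its own, and the names events, week_number, counts, and retention are assumptions introduced here.

```python
import pandas as pd

# Hypothetical events standing in for user_events.csv (2024-01-01 is a Monday).
events = pd.DataFrame({
    'user_id': [1, 1, 1, 2, 2, 3, 3, 4],
    'event_date': pd.to_datetime([
        '2024-01-01', '2024-01-09', '2024-01-16',  # user 1: weeks 0, 1, 2
        '2024-01-02', '2024-01-17',                # user 2: weeks 0, 2
        '2024-01-08', '2024-01-15',                # user 3: weeks 0, 1
        '2024-01-10',                              # user 4: week 0 only
    ]),
})

# Same cohort assignment as in the lesson.
cohort_week = (events.groupby('user_id')['event_date'].min()
               .dt.to_period('W').rename('cohort_week'))
events = events.join(cohort_week, on='user_id')
events['event_week'] = events['event_date'].dt.to_period('W')

# Whole weeks elapsed between the cohort week and the event week.
events['week_number'] = (
    events['event_week'].dt.start_time - events['cohort_week'].dt.start_time
).dt.days // 7

# Count distinct active users per cohort and week offset.
counts = events.pivot_table(
    index='cohort_week',
    columns='week_number',
    values='user_id',
    aggfunc='nunique',
    fill_value=0,
)

# Express each row as a percentage of its week-0 cohort size.
retention = counts.div(counts[0], axis=0).mul(100).round(1)
print(retention)
```

Counting with nunique means a user with several events in the same week is counted once. Note that fill_value=0 also fills weeks that have not happened yet for recent cohorts, so on real data it may be preferable to leave those cells as NaN rather than report them as 0% retention.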