Monitoring Replication Lag and Slot Bloat
Track WAL retention on replication slots to prevent runaway disk usage on the primary.
Why Slots Can Eat Your Disk
Logical replication slots make subscribers durable: the primary retains WAL until every slot, connected or not, has confirmed that it consumed the changes. That same guarantee is what can fill your data disk.
- A slot whose consumer is down, cut off by a network split, or applying slowly keeps WAL pinned indefinitely, unless max_slot_wal_keep_size (PostgreSQL 13+) caps it.
- WAL accumulates in pg_wal/, the partition fills, and the primary can stop accepting writes.
This lesson is about observing that retention early, before it becomes an outage. The core question is always: how far behind is each slot, in bytes?
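One way to put a number on that question, as a minimal sketch: compare each slot's restart_lsn with the primary's current WAL write position using pg_wal_lsn_diff, reading from the pg_replication_slots view covered in the next section. This assumes PostgreSQL 10 or later (where these function names apply) and a session on the primary; the retained_bytes and retained_pretty aliases are illustrative names, not part of the view.

```sql
-- Approximate WAL retained on behalf of each slot, largest first.
-- Run on the primary: pg_current_wal_lsn() cannot be called during recovery.
SELECT slot_name,
       active,
       pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn) AS retained_bytes,
       pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn)) AS retained_pretty
FROM pg_replication_slots
-- restart_lsn is NULL for a slot that has not reserved any WAL yet.
ORDER BY retained_bytes DESC NULLS LAST;
```

A slot that stays at the top of this list with a steadily growing retained_bytes is the one to investigate first.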
The Authoritative View: pg_replication_slots
Every slot is visible in pg_replication_slots. The columns that matter for bloat are active, restart_lsn, and, on PostgreSQL 13+, the retention bookkeeping columns wal_status and safe_wal_size.
- restart_lsn: the oldest LSN the slot still needs; WAL before it can be recycled.
- active: whether a consumer is currently connected.
- wal_status: reserved, extended, unreserved, or lost.
SELECT slot_name,
       slot_type,
       active,
       restart_lsn,
       wal_status,
       safe_wal_size
FROM pg_replication_slots
ORDER BY active, slot_name;
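As a follow-up sketch, the pg_replication_slots view can be filtered down to the slots that deserve attention: those with no connected consumer, or those whose wal_status has left the healthy states. Treating unreserved and lost as the alerting states is an assumption about alerting policy, not something the view mandates, and wal_status and safe_wal_size require PostgreSQL 13+. The safe_wal_pretty alias is illustrative.

```sql
-- Slots that are disconnected or at risk of losing required WAL (PostgreSQL 13+).
SELECT slot_name,
       slot_type,
       active,
       wal_status,
       -- safe_wal_size is NULL when max_slot_wal_keep_size is -1 or the slot is lost.
       pg_size_pretty(safe_wal_size) AS safe_wal_pretty
FROM pg_replication_slots
WHERE NOT active
   OR wal_status IN ('unreserved', 'lost')
ORDER BY slot_name;
```

An empty result is the healthy case; any row returned here is a candidate for an alert or a manual check of the consumer.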