Measuring Table and Index Bloat Accurately
Use pgstattuple and estimation queries to quantify dead space before deciding on a remediation path.
Why Bloat Happens
PostgreSQL uses MVCC (Multi-Version Concurrency Control). When you UPDATE or DELETE a row, the old version is not erased immediately. It becomes a dead tuple that still occupies space until VACUUM marks it reusable.
- Bloat = space occupied by dead tuples plus unfilled free space that the table or index no longer needs.
- Bloat inflates on-disk size, slows sequential scans, and reduces cache efficiency.
- Indexes bloat too: B-tree pages keep pointers to dead heap tuples until cleaned.
Before choosing a fix (VACUUM, VACUUM FULL, pg_repack, or REINDEX), you must first measure how much bloat actually exists. Guessing leads to unnecessary, disruptive maintenance.
Live vs Dead Tuples
The cheapest first signal comes from the statistics collector. pg_stat_user_tables tracks an estimate of live and dead tuples per table, updated by ANALYZE and autovacuum.
n_live_tup— estimated live rows.n_dead_tup— estimated dead rows awaiting cleanup.- A high
n_dead_tupratio suggests autovacuum is falling behind.
This is an estimate, not a byte-accurate measure, but it costs nothing and is a great triage filter.
SELECT relname,
n_live_tup,
n_dead_tup,
round(n_dead_tup * 100.0 / NULLIF(n_live_tup + n_dead_tup, 0), 2) AS dead_pct,
last_autovacuum
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC
LIMIT 20;All lessons in this course
- Measuring Table and Index Bloat Accurately
- Reclaiming Space with pg_repack
- Tuning Fillfactor for Update-Heavy Tables
- TOAST Internals and Large Value Storage