0PricingLogin
PostgreSQL Performance & Query Optimization · Lesson

Measuring Table and Index Bloat Accurately

Use pgstattuple and estimation queries to quantify dead space before deciding on a remediation path.

Why Bloat Happens

PostgreSQL uses MVCC (Multi-Version Concurrency Control). When you UPDATE or DELETE a row, the old version is not erased immediately. It becomes a dead tuple that still occupies space until VACUUM marks it reusable.

  • Bloat = space occupied by dead tuples plus unfilled free space that the table or index no longer needs.
  • Bloat inflates on-disk size, slows sequential scans, and reduces cache efficiency.
  • Indexes bloat too: B-tree pages keep pointers to dead heap tuples until cleaned.

Before choosing a fix (VACUUM, VACUUM FULL, pg_repack, or REINDEX), you must first measure how much bloat actually exists. Guessing leads to unnecessary, disruptive maintenance.

Live vs Dead Tuples

The cheapest first signal comes from the statistics collector. pg_stat_user_tables tracks an estimate of live and dead tuples per table, updated by ANALYZE and autovacuum.

  • n_live_tup — estimated live rows.
  • n_dead_tup — estimated dead rows awaiting cleanup.
  • A high n_dead_tup ratio suggests autovacuum is falling behind.

This is an estimate, not a byte-accurate measure, but it costs nothing and is a great triage filter.

SELECT relname,
       n_live_tup,
       n_dead_tup,
       round(n_dead_tup * 100.0 / NULLIF(n_live_tup + n_dead_tup, 0), 2) AS dead_pct,
       last_autovacuum
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC
LIMIT 20;

All lessons in this course

  1. Measuring Table and Index Bloat Accurately
  2. Reclaiming Space with pg_repack
  3. Tuning Fillfactor for Update-Heavy Tables
  4. TOAST Internals and Large Value Storage
← Back to PostgreSQL Performance & Query Optimization