PostgreSQL Performance & Query Optimization · Lesson

MCV and N-Distinct Corrections

Use ndistinct and most-common-value statistics to repair join and grouping estimates.

Why Estimates Drift

The PostgreSQL planner chooses join orders, join methods, and grouping strategies from row-count estimates. When those estimates are wrong, you get nested loops over millions of rows or a hash table sized for the wrong cardinality.

Two column-level statistics drive most of these estimates:

n_distinct — how many distinct values the planner believes a column holds. It feeds grouping and join cardinality.
most_common_vals (MCV) — the list of frequent values and their frequencies, used for selectivity of equality predicates.

This lesson shows how to read, diagnose, and correct both when the default sampling gets them wrong.

Reading pg_stats

Everything the planner knows about a column lives in the pg_stats view, a human-readable wrapper over pg_statistic. Start every diagnosis here.

Key columns: n_distinct, most_common_vals, most_common_freqs, and null_frac.

SELECT attname,
       n_distinct,
       null_frac,
       most_common_vals,
       most_common_freqs
FROM pg_stats
WHERE schemaname = 'public'
  AND tablename = 'orders'
  AND attname IN ('customer_id', 'status');

All lessons in this course

How the Planner Estimates Row Counts
Multivariate Statistics for Correlated Columns
MCV and N-Distinct Corrections
Validating Estimates Against Actual Rows

← Back to PostgreSQL Performance & Query Optimization