COPY vs Multi-Row INSERT Throughput
Benchmark and choose ingestion methods that maximize rows per second under real constraints.
Why Ingestion Speed Matters
When you load millions of rows into PostgreSQL, the method you choose determines whether the job takes seconds or hours. This lesson benchmarks the two main ingestion paths: COPY and multi-row INSERT.
- COPY streams rows through a single, optimized bulk path.
- Multi-row INSERT packs many tuples into one statement to amortize round-trips.
The right choice depends on data source, round-trip cost, and how the rows arrive.
The Naive Baseline: Single-Row INSERT
The slowest pattern is one row per statement. Each statement pays for parsing, planning, a network round-trip, and (without batching) a separate commit.
At thousands of rows, the per-statement overhead dominates and throughput collapses. This is the baseline every other method beats.
-- Slow: one round-trip and (by default) one commit per row
INSERT INTO events (user_id, kind, payload) VALUES (1, 'click', '{}');
INSERT INTO events (user_id, kind, payload) VALUES (2, 'view', '{}');
INSERT INTO events (user_id, kind, payload) VALUES (3, 'click', '{}');
-- ... repeated 1,000,000 timesAll lessons in this course
- COPY vs Multi-Row INSERT Throughput
- Deferring Indexes and Constraints During Load
- Tuning WAL and Checkpoints for Ingestion
- Upserts at Scale with ON CONFLICT