MongoDB Academy · Lesson

Aggregation Pipeline Optimization Tips

Learners will restructure aggregation pipelines to push $match and $project early, avoid $unwind before $match, and use allowDiskUse for large sorts.

Aggregation Performance: The Big Picture

Aggregation pipelines can be expensive: they can scan millions of documents, build large in-memory structures, and run for seconds. The key to fast pipelines is applying data-reduction stages as early as possible so that later stages work on the smallest possible dataset. MongoDB's internal optimizer reorders certain stages automatically (for example, it can move a $match ahead of a $sort), but it cannot do so in every case, so ordering stages deliberately keeps performance predictable.

Put $match Early: Filter Before You Transform

$match is the aggregation filter stage. Placing it as early as possible reduces the number of documents that flow into every subsequent stage. When $match is the first stage of the pipeline, it can use an index just like a find() query, which makes it a very fast first step. Even without an index, an early $match beats a late one, because later stages never process documents that would be discarded anyway.
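One way to check whether a leading $match actually uses an index is to ask for the query plan. The sketch below is a minimal example under assumptions: the orders collection, the status and amount fields, and the single-field index are hypothetical names chosen for illustration, and the exact shape of the explain output varies by server version.

```javascript
// Hypothetical collection and field names; adjust them to your schema.
const pendingTotals = [
  { $match: { status: 'pending' } }, // first stage, so it is eligible for an index
  { $group: { _id: '$status', total: { $sum: '$amount' } } },
];

// `db` and `printjson` exist only inside mongosh, so this guard lets the
// file load in other JavaScript runtimes without touching a database.
if (typeof db !== 'undefined') {
  // Assumed index for this example; skip if an equivalent one already exists.
  db.orders.createIndex({ status: 1 });

  // Print the plan with execution statistics for the pipeline.
  const plan = db.orders.explain('executionStats').aggregate(pendingTotals);
  printjson(plan);
}
```

In the printed plan, an IXSCAN stage suggests the index was used for the filter, while a COLLSCAN stage means the whole collection was read.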

// BAD: $group processes all 1M docs, then $match discards most
db.orders.aggregate([
  { $group: { _id: '$status', total: { $sum: '$amount' } } },
  { $match: { _id: 'pending' } }   // late match
])

// GOOD: $match first, so only 'pending' docs enter $group
db.orders.aggregate([
  { $match: { status: 'pending' } },  // early match, can use an index on status if one exists
  { $group: { _id: '$status', total: { $sum: '$amount' } } }  // same grouping key as above, same result
])
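To make the difference concrete without a running server, the two stage orders can be imitated in plain JavaScript. This is only a rough sketch: matchStage and groupByStatus are hypothetical stand-ins written for this example, not MongoDB APIs, and the five sample orders are made up. It counts how many documents reach the grouping step in each ordering.

```javascript
// Made-up sample data standing in for the orders collection.
const orders = [
  { status: 'pending', amount: 10 },
  { status: 'shipped', amount: 25 },
  { status: 'pending', amount: 5 },
  { status: 'cancelled', amount: 40 },
  { status: 'shipped', amount: 15 },
];

let groupedDocs = 0; // number of documents that reach the grouping step

// Stand-in for $match: keep only documents that satisfy the predicate.
function matchStage(docs, predicate) {
  return docs.filter(predicate);
}

// Stand-in for $group with _id: '$status' and total: { $sum: '$amount' }.
function groupByStatus(docs) {
  const totals = new Map();
  for (const doc of docs) {
    groupedDocs += 1;
    totals.set(doc.status, (totals.get(doc.status) || 0) + doc.amount);
  }
  return [...totals].map(([_id, total]) => ({ _id, total }));
}

// Late match: group every order, then keep only the 'pending' group.
groupedDocs = 0;
const lateResult = matchStage(groupByStatus(orders), (g) => g._id === 'pending');
const lateGrouped = groupedDocs;

// Early match: filter to 'pending' orders first, then group.
groupedDocs = 0;
const earlyResult = groupByStatus(matchStage(orders, (o) => o.status === 'pending'));
const earlyGrouped = groupedDocs;

console.log(lateResult, 'docs grouped:', lateGrouped);
console.log(earlyResult, 'docs grouped:', earlyGrouped);
```

Both orderings produce the same single result document, but the early-match version sends 2 documents through the grouping step instead of 5. On a real collection the gap grows with the share of documents the filter rejects.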
