MongoDB Academy · Lesson

Pipeline Concepts: Stages, Operators, and Expressions

Learners will understand how data flows through pipeline stages and distinguish between stage operators and expression operators.

What Is the Aggregation Pipeline?

The aggregation pipeline is MongoDB's server-side data transformation framework. Instead of retrieving raw documents and processing them in application code, you define a sequence of stages that transform the data stream step by step (filtering, reshaping, grouping, computing, and sorting), all within the database engine. This keeps heavy computation close to the data and can greatly reduce the amount of data sent over the network, because only the final results are returned to the client.

// A simple aggregation pipeline
db.orders.aggregate([
  { $match: { status: 'completed' } },   // stage 1: filter
  { $group: { _id: '$userId', total: { $sum: '$amount' } } },  // stage 2: group
  { $sort: { total: -1 } },             // stage 3: sort
  { $limit: 10 }                        // stage 4: take top 10
]);
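
The pipeline above mixes two kinds of operators. The outer key of each stage ($match, $group, $sort, $limit) is a stage operator, while $sum, nested inside the $group stage, is an expression operator (used here as an accumulator). The sketch below treats the pipeline as a plain JavaScript array so the two kinds can be told apart without a running server. The helper nestedOperators is hypothetical and written only for this illustration. It works for this pipeline, but it is not a general classifier: query operators inside $match (such as $gt) would also be picked up by it.

```javascript
// The same pipeline as a plain array, so it can be inspected without a server.
const pipeline = [
  { $match: { status: 'completed' } },
  { $group: { _id: '$userId', total: { $sum: '$amount' } } },
  { $sort: { total: -1 } },
  { $limit: 10 }
];

// Each stage is an object with exactly one key: the stage operator.
const stageOperators = pipeline.map((stage) => Object.keys(stage)[0]);

// Hypothetical helper: collect every $-prefixed KEY nested inside a stage body.
// String VALUES such as '$userId' are field paths, not operators, so they are skipped.
function nestedOperators(value, found = []) {
  if (value !== null && typeof value === 'object') {
    for (const [key, inner] of Object.entries(value)) {
      if (key.startsWith('$')) found.push(key);
      nestedOperators(inner, found);
    }
  }
  return found;
}

const expressionOperators = pipeline.flatMap((stage) =>
  nestedOperators(Object.values(stage)[0])
);

console.log(stageOperators);      // [ '$match', '$group', '$sort', '$limit' ]
console.log(expressionOperators); // [ '$sum' ]
```

A rough rule of thumb: stage operators decide what happens to the whole stream of documents, and expression operators compute values inside a stage.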

How Documents Flow Through Stages

Think of the pipeline as a conveyor belt: documents enter stage 1, are transformed (or filtered out), and the results flow into stage 2, then stage 3, and so on. At each stage, the set of documents can shrink (filtering with $match or $limit), expand ($unwind), or be completely replaced by computed summaries ($group). Documents that fail a filter condition are dropped from the pipeline and never reach later stages. The metaphor has one limit: stages such as $group and $sort must see all of their input before they can emit output, so they do not pass documents along one at a time.

// Data flow visualization:
// Input: 10,000 orders
// Stage 1 ($match: status='completed'):  8,000 docs
// Stage 2 ($group by userId):            1,200 docs (one per user)
// Stage 3 ($sort by total DESC):         1,200 docs (reordered)
// Stage 4 ($limit 10):                      10 docs
// Output to client: 10 docs
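
To make the shrinking document counts concrete, here is a minimal plain-JavaScript sketch that mimics the four stages on a small in-memory array. It is an illustration of the data flow only, not how the server executes a pipeline. The sample orders are made up, and the limit is 2 rather than 10 to suit the tiny dataset.

```javascript
// Hypothetical sample data standing in for the orders collection.
const orders = [
  { userId: 'u1', status: 'completed', amount: 40 },
  { userId: 'u2', status: 'completed', amount: 25 },
  { userId: 'u1', status: 'pending',   amount: 99 },
  { userId: 'u3', status: 'completed', amount: 10 },
  { userId: 'u2', status: 'completed', amount: 30 },
  { userId: 'u1', status: 'completed', amount: 5 }
];

// Stage 1 ($match): drop documents that fail the filter.
const matched = orders.filter((o) => o.status === 'completed');

// Stage 2 ($group): replace the documents with one summary per userId.
const totals = new Map();
for (const o of matched) {
  totals.set(o.userId, (totals.get(o.userId) ?? 0) + o.amount);
}
const grouped = [...totals].map(([_id, total]) => ({ _id, total }));

// Stage 3 ($sort): same documents, reordered by total descending.
const sorted = [...grouped].sort((a, b) => b.total - a.total);

// Stage 4 ($limit): keep only the first 2 documents.
const limited = sorted.slice(0, 2);

// Document count after each step: input, $match, $group, $sort, $limit.
console.log(orders.length, matched.length, grouped.length, sorted.length, limited.length);
// prints: 6 5 3 3 2

console.log(limited);
// [ { _id: 'u2', total: 55 }, { _id: 'u1', total: 45 } ]
```

Note that the pending order never reaches the grouping step, which is why u1's total is 45 rather than 144.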
