MongoDB Academy · Lesson

The Outlier and Tree Structure Patterns

Learners will handle documents with unusually large arrays using the outlier pattern and model hierarchical tree data with parent references or materialised paths.

The Outlier Problem

Most MongoDB collections have documents that follow a predictable size distribution. Occasionally, though, you get outliers: documents that deviate dramatically from the norm. A social media post that goes viral might accumulate 50,000 comments while typical posts have 5–20. A product liked by a celebrity might have 10,000 reviews. Designing your schema around the average case while ignoring outliers can leave you with documents that eventually hit the 16 MB BSON document size limit or cause memory pressure whenever they are loaded.
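
To make the size risk concrete, here is a back-of-the-envelope sketch in plain JavaScript. The average comment size (300 bytes) and base document size (1 KB) are illustrative assumptions, not measurements, and the helper names maxEmbeddedItems and estimatedDocBytes are hypothetical. The estimate also ignores BSON's per-element array overhead, so treat the results as rough upper bounds.

```javascript
// MongoDB's maximum BSON document size: 16 MB.
const MAX_BSON_BYTES = 16 * 1024 * 1024;

// Roughly how many embedded array items fit before the document hits the limit.
// avgItemBytes and baseDocBytes are assumed averages supplied by the caller.
function maxEmbeddedItems(avgItemBytes, baseDocBytes = 0) {
  if (avgItemBytes <= 0) throw new RangeError('avgItemBytes must be positive');
  return Math.floor((MAX_BSON_BYTES - baseDocBytes) / avgItemBytes);
}

// Approximate size in bytes of a document holding itemCount embedded items.
function estimatedDocBytes(itemCount, avgItemBytes, baseDocBytes = 0) {
  return baseDocBytes + itemCount * avgItemBytes;
}

// A viral post with 50,000 comments at an assumed 300 bytes each.
console.log(estimatedDocBytes(50000, 300, 1024)); // 15001024
console.log(maxEmbeddedItems(300, 1024));         // 55920
```

Under these assumptions a 50,000-comment post already weighs about 15 MB, and a few thousand more comments would push it past the limit. For real data, the $bsonSize aggregation operator (MongoDB 4.4 and later) reports actual document sizes.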

Detecting Outlier Documents

Before designing around outliers, identify whether they actually exist in your data. Use an aggregation pipeline to find documents with unusually large arrays. Set a threshold based on your expected normal range — if 99% of posts have fewer than 100 comments, documents with more than 1,000 comments are outliers worth handling specially.

// Find posts with outlier-level comment counts
db.posts.aggregate([
  {
    $project: {
      title: 1,
      commentCount: { $size: { $ifNull: ['$comments', []] } }
    }
  },
  { $match: { commentCount: { $gt: 1000 } } },
  { $sort: { commentCount: -1 } },
  { $limit: 10 }
])
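
The threshold of 1,000 in the pipeline above is hard-coded. One possible way to derive it from your own data is sketched below in plain JavaScript. It assumes you have already collected per-document comment counts into an array, uses the nearest-rank percentile method, and applies a 10x multiplier that simply mirrors the "under 100 is normal, over 1,000 is an outlier" example. The function names and the sample counts are hypothetical.

```javascript
// Nearest-rank percentile of a non-empty list of numbers (0 < p <= 100).
function percentile(values, p) {
  if (values.length === 0) throw new RangeError('values must not be empty');
  const sorted = [...values].sort((a, b) => a - b);
  // Multiply before dividing to avoid floating-point surprises in the rank.
  const rank = Math.ceil((p * sorted.length) / 100); // 1-based rank
  return sorted[Math.max(rank, 1) - 1];
}

// Outlier threshold as a multiple of the chosen percentile.
// The defaults (99th percentile, 10x) are assumptions to tune for your data.
function outlierThreshold(counts, p = 99, multiplier = 10) {
  return percentile(counts, p) * multiplier;
}

// Hypothetical sample: 99 ordinary posts (1 to 99 comments) and one viral post.
const sampleCounts = Array.from({ length: 99 }, (_, i) => i + 1).concat(50000);

const threshold = outlierThreshold(sampleCounts);
console.log(threshold);                                   // 990
console.log(sampleCounts.filter((c) => c > threshold));   // [ 50000 ]
```

The resulting number can replace the literal 1000 in the $match stage. Recompute it periodically, since the normal range shifts as the collection grows.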
