AI Agents · Lesson

Why Long Contexts Don't Scale

Cost grows linearly with every token you resend, quality degrades, and the 'lost-in-the-middle' effect makes the model overlook content buried in long prompts.

Bigger Is Not Always Better

Modern models have huge context windows — 200k, 1M, even 2M tokens. It is tempting to "just dump everything in" instead of building real memory.

That approach fails in production for three reasons.

Reason 1: Cost

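Every token in every call costs money. A 100k-token system prompt resent on each of 1,000 turns adds up to 100M input tokens. At an illustrative price of $3 per million input tokens, that is about $300 per session, before counting the growing conversation history or any output tokens.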

Prompt caching helps, but it does not eliminate the cost: cached tokens are typically still billed, just at a reduced rate.
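To make the arithmetic concrete, here is a minimal sketch of the calculation. The function name, the $3 per million input tokens price, and the `cached_price_ratio` knob are all illustrative assumptions, not any provider's actual pricing or API. The model is also simplified: it ignores output tokens, conversation history, and any surcharge for writing to the cache.

```python
def session_input_cost(
    prompt_tokens: int,
    turns: int,
    usd_per_million: float,
    cached_price_ratio: float = 1.0,
) -> float:
    """Input cost in USD when the same prompt is resent on every turn.

    cached_price_ratio is a hypothetical parameter: the fraction of the
    normal price charged for tokens served from a prompt cache
    (1.0 means no caching). The first turn is billed at full price
    because nothing has been cached yet.
    """
    if turns <= 0:
        return 0.0
    first_call_tokens = prompt_tokens
    repeat_tokens = prompt_tokens * (turns - 1)
    # Express cached tokens as their full-price equivalent.
    billed_equivalent = first_call_tokens + repeat_tokens * cached_price_ratio
    return billed_equivalent / 1_000_000 * usd_per_million


# Assumed price: $3 per million input tokens (illustrative only).
no_cache = session_input_cost(100_000, 1_000, 3.0)
with_cache = session_input_cost(100_000, 1_000, 3.0, cached_price_ratio=0.1)

print(f"No caching:   ${no_cache:,.2f}")    # $300.00
print(f"With caching: ${with_cache:,.2f}")  # $30.27
```

Even with an assumed 90% discount on cached tokens, the session still costs about a tenth of the uncached figure, and that cost scales with the number of turns.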
