Summarisation as Compression
Periodically replace old turns with a running summary, trading exact recall for context window space.
Compression as Memory
Once a conversation gets long, replace old turns with a short summary. You trade exact recall for context space.
The summary keeps the gist (facts, decisions, pending items), but the verbatim text of the old turns is gone.
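The replacement step can be sketched as follows. This is a minimal illustration, not a definitive implementation: the `Turn` type, the `compact` function, the `keep_recent` parameter, and the `naive_summarise` stub are all hypothetical names introduced here. In practice the summariser would probably be a call to a language model; the stub below only truncates each turn so the example runs on its own.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Turn:
    role: str  # e.g. "user", "assistant", or "system"
    text: str


def naive_summarise(turns: List[Turn]) -> str:
    # Placeholder summariser (assumption): keeps the first 40 characters
    # of each turn. A real system would ask a model for the gist instead.
    return " | ".join(f"{t.role}: {t.text[:40]}" for t in turns)


def compact(
    turns: List[Turn],
    summarise: Callable[[List[Turn]], str],
    keep_recent: int = 4,
) -> List[Turn]:
    """Replace all but the last `keep_recent` turns with one summary turn."""
    split = len(turns) - keep_recent
    if split <= 0:
        return list(turns)  # nothing old enough to compress
    old, recent = turns[:split], turns[split:]
    summary_turn = Turn("system", "Summary of earlier conversation: " + summarise(old))
    return [summary_turn] + recent


history = [Turn("user" if i % 2 == 0 else "assistant", f"message {i}") for i in range(6)]
compacted = compact(history, naive_summarise, keep_recent=2)
```

Here `compacted` holds one summary turn followed by the two most recent turns unchanged. The older four turns survive only as whatever the summariser chose to keep.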
When to Summarise
Trigger summarisation when:
- Total tokens > threshold (e.g. 8k)
- Turn count > N (e.g. 20)
- User starts a new topic
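The three triggers above could be combined into one check along these lines. This is a sketch under stated assumptions: `estimate_tokens` uses a rough four-characters-per-token heuristic rather than a real tokenizer, the function and parameter names are hypothetical, and topic detection is left to the caller, who passes the result in as `topic_changed`.

```python
from typing import List


def estimate_tokens(text: str) -> int:
    # Rough heuristic (assumption): about 4 characters per token.
    # Swap in the tokenizer for your model for accurate counts.
    return max(1, len(text) // 4)


def should_summarise(
    turns: List[str],
    topic_changed: bool = False,
    max_tokens: int = 8000,  # the "8k" example threshold
    max_turns: int = 20,     # the "N = 20" example threshold
) -> bool:
    """Return True if any of the three triggers fires."""
    total_tokens = sum(estimate_tokens(t) for t in turns)
    return (
        total_tokens > max_tokens
        or len(turns) > max_turns
        or topic_changed
    )
```

Any single condition is enough to trigger summarisation, so a short conversation with one very long message and a long conversation of short messages are both handled.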