AI Agents · Lesson

Model Routing (Cheap -> Expensive)

Try a small model first, and escalate to a frontier model only when the small one fails or returns a low-confidence answer.

Don't Use Big Models For Easy Tasks

Calling GPT-4o to answer "is this email spam?" wastes money, because a much cheaper model can usually handle simple classification like that. Route easy tasks to cheap models and hard tasks to expensive ones.
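
To make the savings concrete, here is a rough back-of-the-envelope sketch in Python. It assumes every request pays for the small model, and only the escalated fraction also pays for the large one. The function name and the cost figures are made up for illustration; substitute your provider's real prices and your own measured escalation rate.

```python
def expected_cost_per_request(small_cost: float, large_cost: float,
                              escalation_rate: float) -> float:
    """Average cost per request when the small model is always tried first.

    escalation_rate is the fraction of requests (0.0 to 1.0) that fall
    through to the large model after the small one fails or is unsure.
    """
    if not 0.0 <= escalation_rate <= 1.0:
        raise ValueError("escalation_rate must be between 0.0 and 1.0")
    # Every request pays for the small call; escalated ones pay for both.
    return small_cost + escalation_rate * large_cost


# Illustrative numbers only (arbitrary cost units, not real prices):
# small call = 1 unit, large call = 20 units, 25% of requests escalate.
routed = expected_cost_per_request(1.0, 20.0, 0.25)   # 1 + 0.25 * 20 = 6.0
always_large = 20.0
```

Under this model, routing beats always calling the large model whenever the escalation rate is below 1 - small_cost / large_cost. If nearly every request escalates, the small call is pure overhead.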

The Routing Pattern

1. Try the small, cheap model first.
2. If the output is low-confidence or fails validation, escalate to the big model.
3. Log which path was taken.
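
The pattern above can be sketched in a few lines of Python. This is a minimal illustration, not tied to any particular provider: `ModelResult`, `route`, the `validate` callback, and the 0.8 confidence threshold are all hypothetical names and values chosen for the example. It also assumes each model call can return some confidence score, which in practice you would have to derive yourself (for example from token log-probabilities or a self-rating prompt).

```python
import logging
from dataclasses import dataclass
from typing import Callable, Optional, Tuple

logger = logging.getLogger("model_router")


@dataclass
class ModelResult:
    text: str
    confidence: float  # 0.0 to 1.0; how it is estimated is up to the caller


# Hypothetical interface: a "model" is any callable from prompt to ModelResult.
Model = Callable[[str], ModelResult]


def route(prompt: str, small: Model, large: Model,
          validate: Callable[[str], bool],
          min_confidence: float = 0.8) -> Tuple[ModelResult, str]:
    """Try the small model first; escalate to the large one if needed.

    Returns the result together with the path taken ("small" or "large").
    """
    result: Optional[ModelResult] = None
    try:
        result = small(prompt)
    except Exception:
        # A failed small-model call is treated like a low-confidence answer.
        logger.warning("small model raised; escalating", exc_info=True)

    if (result is not None
            and result.confidence >= min_confidence
            and validate(result.text)):
        logger.info("path=small confidence=%.2f", result.confidence)
        return result, "small"

    # Low confidence, failed validation, or an error: use the large model.
    logger.info("path=large")
    return large(prompt), "large"
```

In a real agent, `small` and `large` would wrap your actual API calls, and `validate` would check whatever the task requires, such as "the answer is one of the allowed labels" or "the output parses as JSON". Returning the path alongside the result makes it easy to track the escalation rate over time.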
