
Model Routing (Cheap -> Expensive)

Try a small model first, and escalate to a frontier model only when the small one fails or returns a low-confidence answer.

Don't Use Big Models For Easy Tasks

Calling GPT-4o for "is this email spam?" wastes money. Route easy tasks to cheap models, hard tasks to expensive ones.

The Routing Pattern

  1. Try small/cheap model first
  2. If output is low-confidence or fails validation, escalate to big model
  3. Log which path was taken
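
The three steps above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: `ModelResult`, `route`, the stub models `fake_small` and `fake_big`, the validator `is_valid_label`, and the 0.8 confidence threshold are all hypothetical names and values chosen for the example. In a real agent the two callables would wrap actual API clients, and the confidence score would come from whatever signal you trust (log-probabilities, a self-reported score, a separate classifier).

```python
import logging
from dataclasses import dataclass
from typing import Callable, Optional, Tuple

logger = logging.getLogger("router")


@dataclass
class ModelResult:
    text: str
    confidence: float  # assumed to be in [0.0, 1.0]; how it is estimated is up to the caller


ModelFn = Callable[[str], ModelResult]


def route(
    prompt: str,
    small: ModelFn,
    big: ModelFn,
    validate: Callable[[str], bool],
    min_confidence: float = 0.8,  # hypothetical threshold; tune on your own data
) -> Tuple[str, str]:
    """Return (answer, path), where path is 'small' or 'big'."""
    # Step 1: try the small/cheap model first.
    result: Optional[ModelResult]
    try:
        result = small(prompt)
    except Exception:
        logger.warning("small model raised; escalating")
        result = None

    # Accept the cheap answer only if it is confident and passes validation.
    if (
        result is not None
        and result.confidence >= min_confidence
        and validate(result.text)
    ):
        logger.info("path=small")  # Step 3: log which path was taken
        return result.text, "small"

    # Step 2: low confidence, failed validation, or an error, so escalate.
    big_result = big(prompt)
    logger.info("path=big")  # Step 3: log which path was taken
    return big_result.text, "big"


# --- Stub models for demonstration only (no real API calls) ---

def fake_small(prompt: str) -> ModelResult:
    # Pretend the cheap model is only sure about obvious spam.
    if "free money" in prompt.lower():
        return ModelResult("spam", 0.95)
    return ModelResult("unsure", 0.3)


def fake_big(prompt: str) -> ModelResult:
    # Pretend the expensive model always gives a confident answer.
    return ModelResult("not spam", 0.99)


def is_valid_label(text: str) -> bool:
    return text in {"spam", "not spam"}


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    print(route("FREE MONEY now", fake_small, fake_big, is_valid_label))
    print(route("Meeting moved to 3pm", fake_small, fake_big, is_valid_label))
```

Returning the path alongside the answer makes it easy to measure how often requests escalate. If most requests end up on the big model, the small model or the threshold is probably a poor fit for the task and the routing is adding latency without saving money.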
