Trade-offs: Latency, Cost, Capability

Open-weight models can cut per-token spend but cost engineering hours to host and maintain; benchmark on your own workload before committing.

Three Axes, Pick Two

Every model choice trades off three axes: latency, cost, and capability.

No model is best on all three.

Weights	Top tier	Mid tier	Small tier
Closed	GPT-4o, Claude Sonnet 4.5	GPT-4o-mini, Claude Haiku	—
Open	Llama 3.1 405B	Llama 3.1 70B, Qwen 2.5 72B	Llama 3.1 8B, Phi-3
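
One way to make "no model is best on all three" concrete is to benchmark a few candidates on your own workload and keep only those that no other candidate beats on every axis (a Pareto front). The sketch below is a minimal illustration, not a prescribed method: the `Candidate` class, the candidate names, and all numbers are hypothetical placeholders, to be replaced with your own measurements.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str         # hypothetical label, not a real model
    latency_s: float  # seconds per request (lower is better)
    cost_usd: float   # dollars per 1,000 requests (lower is better)
    quality: float    # score on your own eval set (higher is better)


def dominates(a: Candidate, b: Candidate) -> bool:
    """True if a is no worse than b on every axis and strictly better on at least one."""
    no_worse = (
        a.latency_s <= b.latency_s
        and a.cost_usd <= b.cost_usd
        and a.quality >= b.quality
    )
    strictly_better = (
        a.latency_s < b.latency_s
        or a.cost_usd < b.cost_usd
        or a.quality > b.quality
    )
    return no_worse and strictly_better


def pareto_front(candidates: list[Candidate]) -> list[Candidate]:
    """Keep the candidates that no other candidate dominates."""
    return [
        c for c in candidates
        if not any(dominates(other, c) for other in candidates)
    ]


# Placeholder numbers for illustration only; substitute real benchmark results.
candidates = [
    Candidate("large", latency_s=2.0, cost_usd=15.0, quality=0.90),
    Candidate("medium", latency_s=0.8, cost_usd=3.0, quality=0.80),
    Candidate("small", latency_s=0.2, cost_usd=0.5, quality=0.60),
    Candidate("slow-small", latency_s=0.9, cost_usd=3.5, quality=0.55),
]

front = pareto_front(candidates)
print([c.name for c in front])  # → ['large', 'medium', 'small']
```

With these placeholder values, `slow-small` drops out because `medium` is faster, cheaper, and more capable. The other three each win on at least one axis, so choosing among them depends on which two axes matter most for the task.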