When Prompting Is Enough
Cost and flexibility trade-offs.
The Default Should Be Prompting
Before reaching for fine-tuning, treat prompting as the null hypothesis. Modern frontier models hold enough latent capability that most tasks are a retrieval-and-instruction problem, not a weight-update problem.
The expensive mistake teams make is jumping to a training run when a well-structured prompt, a few exemplars, and tool access would have closed the gap at zero marginal training cost. Fine-tuning is justified only when prompting provably hits a ceiling.
- Prompting changes per request, fine-tuning changes the model
- Prompting is reversible in seconds; a tuned checkpoint is a committed artifact
- Start cheap, escalate only on evidence
The Three Cost Axes
Compare approaches across three independent cost axes, not just dollars:
- Iteration cost - how fast can you change behavior? Prompting: minutes. Fine-tuning: hours-to-days per cycle.
- Inference cost - prompting pays per-token for long instructions and exemplars on every call; a tuned model can fold that behavior into weights and shrink the prompt.
- Maintenance cost - a prompt lives in source control and is auditable; a checkpoint must be re-tuned whenever the base model is deprecated.
Prompting wins on iteration and maintenance; fine-tuning can win on inference cost at high volume.
All lessons in this course
- When Prompting Is Enough
- When to Fine-Tune
- Hybrid: Prompt + Light Tuning
- Evaluating the Decision