AI Prompt Engineering · Lesson

Zero, One, and Few-Shot

Choosing the number of examples.

The Shot Spectrum

A shot is one labeled demonstration placed in the prompt before the live query, so the shot count is the number of such demonstrations. Zero-shot relies entirely on the instruction and the model's pretrained priors; one-shot adds a single demonstration; few-shot conditions the model on a handful of task-specific examples at inference time, without any weight updates.

This is in-context learning (ICL): the transformer treats the examples as part of the input sequence and conditions its prediction on them. Some theoretical work interprets this as a kind of implicit, meta-learned regression over the demonstrations, though the mechanism is not settled. The number of shots, k, is a hyperparameter you tune empirically, not a fixed best practice.

from dataclasses import dataclass

@dataclass
class ICLConfig:
    k: int           # number of demonstrations
    selection: str   # 'static' | 'dynamic'
    order: str       # 'random' | 'similarity' | 'curriculum'

# Zero-shot is simply k=0
cfg = ICLConfig(k=0, selection='static', order='random')
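To make the role of k concrete, here is a minimal sketch of a prompt builder and a single evaluation step you could repeat for several values of k. The names `build_k_shot_prompt` and `accuracy_for_k` are hypothetical, the `Text:`/`Label:` layout is an assumed template, and `predict` is a stand-in for whatever model call you use.

```python
from typing import Callable, Sequence

Demo = tuple[str, str]  # (text, label); an assumed demonstration format


def build_k_shot_prompt(instruction: str, demos: Sequence[Demo], query: str, k: int) -> str:
    """Prepend the first k demonstrations to the query; k=0 yields a zero-shot prompt."""
    if k < 0:
        raise ValueError("k must be non-negative")
    parts = [instruction]
    for text, label in demos[:k]:
        parts.append(f"Text: {text}\nLabel: {label}")
    # The live query always comes last, with the label left open for the model.
    parts.append(f"Text: {query}\nLabel:")
    return "\n\n".join(parts)


def accuracy_for_k(
    predict: Callable[[str], str],  # hypothetical: maps a prompt to a predicted label
    instruction: str,
    demos: Sequence[Demo],
    dev_set: Sequence[Demo],
    k: int,
) -> float:
    """Fraction of dev-set items labeled correctly when prompting with k shots."""
    correct = sum(
        predict(build_k_shot_prompt(instruction, demos, text, k)) == gold
        for text, gold in dev_set
    )
    return correct / len(dev_set)
```

Calling `accuracy_for_k` for k in, say, 0, 1, 2, 4, 8 on a held-out dev set gives the empirical curve from which you would pick k, trading accuracy against prompt length.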

When Zero-Shot Wins

Prefer zero-shot when the task is well represented in pretraining (summarization, translation, common classification) and when examples would bias the output format. For instruction-tuned models, a crisp directive plus an output schema often beats examples that subtly anchor style.

Zero-shot also minimizes token cost and latency, and it avoids majority-label bias, where the model over-predicts whichever class dominates your demonstrations.
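If you do move to few-shot, the majority-label bias mentioned above can be checked before sending anything to the model. The following is a rough sketch, not a standard utility: `label_shares` and `balance` are hypothetical helper names, and demonstrations are assumed to be `(text, label)` pairs.

```python
from collections import Counter


def label_shares(demos):
    """Return each label's share of the demonstration set."""
    counts = Counter(label for _, label in demos)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


def balance(demos, per_class):
    """Keep at most per_class demonstrations for each label, preserving order."""
    kept, seen = [], Counter()
    for text, label in demos:
        if seen[label] < per_class:
            kept.append((text, label))
            seen[label] += 1
    return kept
```

A skewed result from `label_shares` suggests either rebalancing the demonstrations or falling back to zero-shot.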

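# Zero-shot with an explicit output schema; often beats vague few-shot
def zero_shot_prompt(user_text: str) -> str:
    """Build a sentiment prompt that constrains the reply to a single label."""
    return (
        'Classify sentiment as POSITIVE, NEGATIVE, or NEUTRAL.\n'
        'Respond with only the label.\n\n'
        f'Text: {user_text}\nLabel:'
    )

# Example input, for illustration only
PROMPT = zero_shot_prompt('The battery died after two days.')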
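An output schema only helps if the reply is checked against it. As a small sketch under assumptions, the hypothetical `parse_label` helper below normalizes a raw reply and accepts it only if it matches one of the three labels named in the prompt; how a real model deviates from the schema will vary.

```python
from typing import Optional

ALLOWED = ("POSITIVE", "NEGATIVE", "NEUTRAL")


def parse_label(raw: str) -> Optional[str]:
    """Return the allowed label in the reply, or None if the reply does not match."""
    # Trim whitespace and common trailing punctuation or quotes, then compare case-insensitively.
    cleaned = raw.strip().strip(".:\"'").upper()
    return cleaned if cleaned in ALLOWED else None
```

Returning `None` instead of guessing lets the caller decide whether to retry, log, or fall back to a default.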
