Anatomy of an API Request
model, max_tokens, system, messages, tools, tool_choice.
The Single Endpoint
Every call to Claude is one request to the Messages API. As an architect, you don't memorize syntax — you reason about six fields that shape the whole interaction: model, max_tokens, system, messages, tools, and tool_choice.
Get these right and everything downstream — agents, tool loops, structured output — falls into place. This lesson walks through each field and the decision it represents.
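To make the six fields concrete, here is a minimal sketch of a full request written as a plain Python dictionary, so it runs without an API key or a network call. The model ID is the one used in this lesson. The `get_weather` tool and its schema are hypothetical and exist only to show where `tools` and `tool_choice` sit.

```python
# Sketch: the six request fields as a plain dictionary (no network call).
# The get_weather tool is hypothetical and exists only for illustration.
request = {
    "model": "claude-opus-4-8",                 # which model does the work
    "max_tokens": 1024,                         # upper bound on generated tokens
    "system": "You are a concise assistant.",   # standing instructions
    "messages": [                               # the conversation so far
        {"role": "user", "content": "What's the weather in Paris?"}
    ],
    "tools": [                                  # functions the model may ask you to run
        {
            "name": "get_weather",
            "description": "Look up the current weather for a city.",
            "input_schema": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        }
    ],
    "tool_choice": {"type": "auto"},            # let the model decide whether to call a tool
}

# List the field names to confirm all six are present.
print(sorted(request))
```

With a client configured, a dictionary shaped like this could be passed as `client.messages.create(**request)`. Keeping the payload as data makes each of the six decisions easy to review on its own.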
from anthropic import Anthropic

client = Anthropic()  # reads ANTHROPIC_API_KEY from the environment

response = client.messages.create(
    model="claude-opus-4-8",
    max_tokens=1024,
    system="You are a concise assistant.",
    messages=[{"role": "user", "content": "Hello!"}],
)
# The reply is a list of content blocks; the first one holds the text.
print(response.content[0].text)

model — Which Brain
The model field selects which Claude does the work. It's a single string, and the choice is a real architectural trade-off: capability versus latency versus cost.
- Opus — most capable, best for long-horizon agentic tasks and hard reasoning.
- Sonnet — strong balance of speed and intelligence.
- Haiku — fastest and cheapest for simple, high-volume tasks.
You can change the model per request, so route easy tasks to a cheaper model and hard ones to a stronger one.
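The routing idea can be sketched as a small helper that picks the model string before the request is built. This is one possible shape, not a prescribed pattern. The difficulty labels and the `build_request` helper are hypothetical. Only `claude-opus-4-8` comes from this lesson, and the other two model IDs are placeholders to replace with real identifiers.

```python
# Sketch: choose a model per request from a caller-supplied difficulty label.
# Only "claude-opus-4-8" comes from this lesson; the other two IDs are
# placeholders, not real model identifiers.
MODEL_BY_TIER = {
    "simple": "haiku-model-id-placeholder",
    "standard": "sonnet-model-id-placeholder",
    "hard": "claude-opus-4-8",
}


def build_request(prompt: str, difficulty: str = "standard") -> dict:
    """Return keyword arguments suitable for client.messages.create()."""
    if difficulty not in MODEL_BY_TIER:
        raise ValueError(f"unknown difficulty: {difficulty!r}")
    return {
        "model": MODEL_BY_TIER[difficulty],  # the only field that changes per tier
        "max_tokens": 1024,
        "messages": [{"role": "user", "content": prompt}],
    }


easy = build_request("Summarize this ticket.", difficulty="simple")
hard = build_request("Plan a multi-step data migration.", difficulty="hard")
print(easy["model"], hard["model"])
```

Because only the `model` string differs between tiers, the rest of the calling code stays the same whichever model handles the request.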
# Same request shape, different routing decision
response = client.messages.create(
    model="claude-opus-4-8",  # swap to a cheaper model for simple tasks
    max_tokens=1024,
    messages=[{"role": "user", "content": "Summarize this ticket."}],
)