Streaming Responses (SSE)
Stream LLM output token-by-token over Server-Sent Events for snappy UIs and progress indication.
Why Stream?
Without streaming, the client waits for the full response before showing anything. If a 50-token response takes 5 seconds to generate, the user sees nothing for those 5 seconds.
Streaming shows tokens as they arrive. The total time is the same, but the first tokens appear almost immediately, so the UI feels responsive.
Server-Sent Events (SSE)
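OpenAI and Anthropic stream over SSE, a one-way protocol on top of HTTP in which the server pushes events to the client. Each event is a block of text lines, typically including a data: {...} line that carries a JSON payload, and ends with a blank line (\n\n).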
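To make the wire format concrete, here is a minimal sketch of an SSE parser in Python. The function name parse_sse is hypothetical, and the sketch only collects data fields: it skips comment lines and ignores other fields such as event, id, and retry, so it is an illustration rather than a complete client.

```python
from typing import Iterable, Iterator


def parse_sse(lines: Iterable[str]) -> Iterator[str]:
    """Yield the data payload of each event from an iterable of SSE lines.

    Hypothetical helper for illustration; not part of any SDK.
    """
    data_parts = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if data_parts:
                yield "\n".join(data_parts)
                data_parts = []
            continue
        if line.startswith(":"):
            # Lines starting with a colon are comments (often keep-alives).
            continue
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # a single leading space after the colon is dropped
        if field == "data":
            # Several data lines in one event are joined with newlines.
            data_parts.append(value)
        # Other fields (event, id, retry) are ignored in this sketch.


# Made-up sample input: two events and one comment line.
sample = 'data: {"token": "Hi"}\n\n: keep-alive\ndata: line one\ndata: line two\n\n'
events = list(parse_sse(sample.splitlines(keepends=True)))
```

With the sample above, events holds two payloads: the JSON string from the first event and the two data lines of the second event joined by a newline.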
Most SDKs hide the SSE parsing behind an iterator interface, so application code simply loops over chunks as they arrive.
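The following sketch shows roughly what such an iterator wrapper could look like. Everything in it is an assumption made for illustration: the TextStream class, the {"text": ...} payload shape, and the [DONE] end-of-stream sentinel are hypothetical and do not describe the actual API of any particular SDK. It also assumes each event carries a single data line.

```python
import json
from typing import Iterable, Iterator


class TextStream:
    """Hypothetical SDK-style wrapper: iterate over it to get text chunks."""

    def __init__(self, lines: Iterable[str]) -> None:
        # In a real client these lines would come from the HTTP response body.
        self._lines = lines

    def __iter__(self) -> Iterator[str]:
        for raw in self._lines:
            line = raw.strip()
            if not line.startswith("data:"):
                continue  # skip blank separators, comments, and other fields
            payload = line[len("data:"):].strip()
            if payload == "[DONE]":  # assumed end-of-stream sentinel
                return
            # Assumed payload shape: {"text": "<chunk>"}
            yield json.loads(payload)["text"]


# Made-up wire data standing in for a live response.
fake_wire = ['data: {"text": "Hel"}', "", 'data: {"text": "lo"}', "", "data: [DONE]", ""]

for chunk in TextStream(fake_wire):
    # The caller never sees SSE framing, only the decoded text.
    print(chunk, end="", flush=True)
```

The design point is that the caller writes an ordinary for loop; framing, JSON decoding, and end-of-stream detection stay inside the wrapper.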