Streaming in FastAPI with Server-Sent Events
Build a FastAPI endpoint that proxies LLM streaming responses to a browser client using StreamingResponse and the text/event-stream content type.
Why Server-Sent Events for LLM Streaming
Server-Sent Events (SSE) is a web standard, originally published by the W3C and now maintained as part of the WHATWG HTML Living Standard, that lets a server push a stream of text events to a browser client over a single long-lived HTTP connection. Unlike WebSockets, SSE is one-directional (server to client) and runs over plain HTTP. Browsers support it natively through the EventSource API, which reconnects automatically when the connection drops, so no client library is needed. These properties make it a good fit for streaming LLM tokens from a FastAPI backend to a web frontend.
SSE Wire Format
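SSE sends UTF-8 text as a series of field: value lines. Each event contains an optional event field naming the event type, one or more data fields carrying the payload, and an optional id field that the browser sends back in the Last-Event-ID header when it reconnects. A blank line terminates each event. For LLM streaming, send each token as data: token_text\n\n and finish with a sentinel event such as data: [DONE]\n\n to signal stream completion. The [DONE] marker is a convention rather than part of SSE, so the client has to check for it explicitly. A token that itself contains a newline must be split across several data lines or encoded (for example as JSON), because a raw newline would end the field.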
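The field layout described above can be sketched as a small serializer. This is a minimal illustration, not part of FastAPI or any SDK: the name format_sse and its parameters are hypothetical, and it assumes the payload is plain text.

```python
from typing import Optional


def format_sse(data: str, event: Optional[str] = None,
               event_id: Optional[str] = None) -> str:
    """Serialize one event into SSE wire format (hypothetical helper)."""
    lines = []
    if event is not None:
        lines.append(f"event: {event}")  # optional event type
    if event_id is not None:
        lines.append(f"id: {event_id}")  # optional id used on reconnect
    # A payload with embedded newlines becomes one data line per payload line;
    # the client joins them back together with "\n".
    for part in data.split("\n"):
        lines.append(f"data: {part}")
    # The trailing blank line terminates the event.
    return "\n".join(lines) + "\n\n"


if __name__ == "__main__":
    print(repr(format_sse("Paris")))
    print(repr(format_sse("line one\nline two", event="token", event_id="7")))
```

Writing "data: " with exactly one space after the colon matters for tokens that begin with a space: the client strips only a single leading space, so a token such as " capital" arrives intact.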
# SSE wire format example
'''
data: The\n\n
data: capital\n\n
data: of\n\n
data: France\n\n
data: is\n\n
data: Paris\n\n
data: [DONE]\n\n
'''
# Each 'data:' line is one event.
# The double newline (\n\n) terminates each event.
# The client receives these as EventSource message events.
# The content-type must be 'text/event-stream'.
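To show how the token-per-event pattern and the [DONE] sentinel might fit together on the server, here is a minimal sketch using only the standard library. The functions fake_token_stream, sse_events, and collect are hypothetical names; fake_token_stream stands in for whatever stream an LLM SDK would provide, and the sketch assumes tokens contain no newline characters.

```python
import asyncio
from typing import AsyncIterator, List


async def fake_token_stream() -> AsyncIterator[str]:
    """Stand-in for an LLM SDK stream (hypothetical); yields one token at a time."""
    for token in ["The", " capital", " of", " France", " is", " Paris"]:
        await asyncio.sleep(0)  # hand control back to the event loop between tokens
        yield token


async def sse_events(tokens: AsyncIterator[str]) -> AsyncIterator[str]:
    """Wrap each token as an SSE data event, then emit the [DONE] sentinel."""
    async for token in tokens:
        # Assumes the token has no newline; otherwise split or JSON-encode it.
        yield f"data: {token}\n\n"
    yield "data: [DONE]\n\n"


async def collect() -> List[str]:
    """Gather every chunk so the output can be inspected without a server."""
    return [chunk async for chunk in sse_events(fake_token_stream())]


if __name__ == "__main__":
    for chunk in asyncio.run(collect()):
        print(repr(chunk))
```

In a FastAPI route, a generator like sse_events could be handed to StreamingResponse (imported from fastapi.responses) with media_type="text/event-stream", so that each yielded string is written to the client as it is produced.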