FastAPI Backend Development Bootcamp · Lesson

Streaming Responses and Range Requests

Serve large files with StreamingResponse and support HTTP range requests for resumable downloads.

Why Stream Responses?

The naive way to return a file from FastAPI, reading it fully and returning the bytes, loads the entire payload into memory before anything is sent. For a 2 GB video that is a disaster: memory spikes, a slow first byte, and crashes under concurrency.

Streaming solves this by sending the body in small chunks as they become available. The server holds only one chunk at a time, and the client starts receiving data almost immediately.

  • StreamingResponse — wraps a generator/iterator that yields bytes.
  • FileResponse — a convenience for serving a file from disk efficiently.
  • Range requests — let clients fetch only part of a file (seeking, resuming).

This lesson builds all three, ending with resumable downloads.
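To make the range-request idea concrete, here is a minimal sketch of how a `Range` header value such as `bytes=0-99` maps onto a byte span of a file. The function name `parse_byte_range` is hypothetical (it is not part of FastAPI or Starlette), and the sketch assumes a single range only. It returns `None` both for malformed headers and for ranges that start past the end of the file, which a real server would treat differently (ignore the header versus answer 416).

```python
def parse_byte_range(header, file_size):
    """Return an inclusive (start, end) span for a single 'bytes=' range.

    Hypothetical helper for illustration. Returns None when the header is
    malformed, uses another unit, lists several ranges, or cannot be
    satisfied for a file of file_size bytes.
    """
    if file_size <= 0:
        return None
    unit, _, spec = header.partition("=")
    if unit.strip().lower() != "bytes" or "," in spec:
        return None  # this sketch handles one byte range only
    first, sep, last = spec.strip().partition("-")
    if not sep:
        return None
    try:
        if first == "":
            # Suffix form: "bytes=-500" asks for the last 500 bytes.
            length = int(last)
            if length <= 0:
                return None
            start = max(file_size - length, 0)
            end = file_size - 1
        else:
            start = int(first)
            # Open-ended form: "bytes=500-" runs to the end of the file.
            end = int(last) if last else file_size - 1
    except ValueError:
        return None
    end = min(end, file_size - 1)  # clamp an end that overshoots the file
    if start < 0 or start > end:
        return None
    return start, end
```

For a 1000-byte file, `bytes=500-` resolves to `(500, 999)`, which is exactly what a client sends to resume a download that stopped after 500 bytes.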

A Generator That Yields Bytes

Streaming starts with an iterable of bytes. The cleanest source is a Python generator that reads a file in fixed-size chunks instead of all at once.

Here is the core idea, isolated from any framework. The generator yields at most 1 MiB at a time, so peak memory stays at roughly one chunk no matter how large the file is.

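import os


def file_chunks(path, chunk_size=1024 * 1024):
    """Yield the file's bytes in pieces of at most chunk_size (1 MiB by default)."""
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:  # empty bytes means end of file
                break
            yield chunk


if __name__ == "__main__":
    # Create a throwaway file slightly larger than 3 MiB.
    with open("sample.bin", "wb") as f:
        f.write(b"x" * (3 * 1024 * 1024 + 17))

    total = 0
    pieces = 0
    for chunk in file_chunks("sample.bin"):
        total += len(chunk)
        pieces += 1
    print("bytes:", total)    # 3145745
    print("chunks:", pieces)  # 4 (three full chunks plus a 17-byte tail)
    os.remove("sample.bin")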
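The same pattern extends naturally to partial content. The sketch below is one possible variant of `file_chunks` that serves only the bytes between two inclusive offsets, the kind of span a `Range` header describes. The name `file_range_chunks` and its parameters are assumptions for illustration, not an API from FastAPI or Starlette.

```python
def file_range_chunks(path, start, end, chunk_size=1024 * 1024):
    """Yield bytes from offset start through end (inclusive) in chunks.

    Hypothetical helper: assumes 0 <= start <= end. If the file is
    shorter than end + 1 bytes, iteration simply stops at end of file.
    """
    remaining = end - start + 1
    with open(path, "rb") as f:
        f.seek(start)  # jump straight to the first requested byte
        while remaining > 0:
            # Never read past the requested span.
            chunk = f.read(min(chunk_size, remaining))
            if not chunk:
                break  # reached end of file early
            remaining -= len(chunk)
            yield chunk
```

A response built on this generator would typically use status 206 and carry a `Content-Range: bytes start-end/size` header so the client knows which slice it received.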
