Throttling, Caching, and Usage Plans
Protect backends with burst and steady-state throttling limits, enable response caching, and create usage plans with API keys for partners.
Why Throttling Is Essential
Without throttling, a single misbehaving client or a traffic spike could overwhelm your backend services, such as Lambda concurrency, RDS connections, or downstream APIs. API Gateway throttles requests with a token bucket: the rate limit is the steady-state number of requests per second at which tokens are refilled, and the burst limit is the bucket capacity, which lets clients briefly exceed the steady-state rate. Throttled requests immediately receive a 429 Too Many Requests response and never reach your backend, which protects downstream resources from overload.
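To make the interaction between the rate and burst limits concrete, here is a minimal sketch of a token bucket in Bash. It is a simplified model, not API Gateway's actual implementation: time advances in whole seconds, tokens are refilled once at the end of each second, and the function name simulate_throttle and the traffic numbers are made up for illustration.

```shell
#!/usr/bin/env bash
# Simplified token-bucket model (an illustration, not API Gateway's real code).
# Usage: simulate_throttle RATE BURST ARRIVALS_PER_SECOND...
simulate_throttle() {
  local rate=$1 burst=$2
  shift 2
  local tokens=$burst   # the bucket starts full
  local second=0 arrivals allowed
  for arrivals in "$@"; do
    second=$((second + 1))
    # Each request consumes one token; requests without a token are throttled.
    allowed=$(( arrivals < tokens ? arrivals : tokens ))
    echo "second=$second allowed=$allowed throttled=$((arrivals - allowed))"
    # Refill at the steady-state rate, capped at the burst capacity.
    tokens=$(( tokens - allowed + rate ))
    if (( tokens > burst )); then
      tokens=$burst
    fi
  done
}

# Rate 100 RPS, burst 200, with 250, 50, and 300 requests arriving per second.
simulate_throttle 100 200 250 50 300
```

In this run the first second admits 200 requests (the full bucket) and throttles 50, the second admits all 50, and the third admits only the 150 tokens that have accumulated. The throttled requests correspond to the 429 responses described above.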
Account-Level and Stage-Level Throttling
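Throttling operates at multiple levels. The default account-level limit is 10,000 requests per second (RPS) with a burst of 5,000 requests, shared by all APIs in the account within a Region. This is a soft limit that can be increased on request. At the stage level, you can set a default rate (RPS) and burst limit that apply to all methods in the stage. At the method level, you can override the stage defaults for specific endpoints, for example giving a read-heavy GET endpoint a higher rate limit than a write-heavy POST endpoint. The following command sets the stage defaults for the prod stage to 1,000 RPS with a burst of 2,000; the /*/* segment of each path targets every resource and method in the stage.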
aws apigateway update-stage \
--rest-api-id 'abc123' \
--stage-name 'prod' \
--patch-operations \
'op=replace,path=/*/*/throttling/rateLimit,value=1000' \
'op=replace,path=/*/*/throttling/burstLimit,value=2000'
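The method-level override described above uses the same update-stage command, with the /*/* wildcard replaced by a specific resource path and HTTP method. Slashes in the resource path are escaped as ~1. The sketch below wraps this in two helper functions; the function names, the /items resource, and the limit values are hypothetical and not part of the original lesson.

```shell
#!/usr/bin/env bash
# Hypothetical helper: build the patch path for a method-level throttling setting.
# Slashes in the resource path are escaped as ~1, so /items becomes ~1items.
method_throttle_path() {
  local resource_path=$1 http_method=$2 setting=$3
  local escaped
  escaped=$(printf '%s' "$resource_path" | sed 's|/|~1|g')
  printf '/%s/%s/throttling/%s\n' "$escaped" "$http_method" "$setting"
}

# Hypothetical wrapper: override the stage defaults for a single method.
# Calling it requires the AWS CLI and credentials with permission to update the stage.
set_method_throttle() {
  local api_id=$1 stage=$2 resource_path=$3 http_method=$4 rate=$5 burst=$6
  aws apigateway update-stage \
    --rest-api-id "$api_id" \
    --stage-name "$stage" \
    --patch-operations \
      "op=replace,path=$(method_throttle_path "$resource_path" "$http_method" rateLimit),value=$rate" \
      "op=replace,path=$(method_throttle_path "$resource_path" "$http_method" burstLimit),value=$burst"
}
```

For example, set_method_throttle abc123 prod /items GET 2000 4000 would raise the limits for GET /items only, sending patch paths of the form /~1items/GET/throttling/rateLimit while every other method keeps the stage defaults.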