AWS Solutions Architect · Lesson

Concurrency, Throttling, and Reserved Concurrency

Understand how Lambda scales concurrently, set reserved concurrency to protect downstream services, and handle throttle errors.

How Lambda Scales Concurrently

Lambda scales by running multiple concurrent executions of your function. Each execution environment handles one request at a time, so when 100 requests arrive at the same moment, Lambda runs 100 parallel instances of your function. AWS manages the underlying infrastructure automatically. By default, the account-level concurrency limit is 1,000 concurrent executions per Region, shared by all functions in that Region; requests beyond the limit are throttled. This is a soft limit that can be raised with a service quota increase request.

Concurrency Calculation

Concurrency can be estimated as: Concurrency = Requests per second × Average duration in seconds. If your function handles 500 requests/second and each invocation takes 0.2 seconds, you need about 100 concurrent executions. This formula helps you predict whether your account limit is sufficient and whether you need to request a quota increase before a high-traffic event.

# Example concurrency calculation
# RPS = 500, avg_duration = 0.2s
# Concurrency = 500 * 0.2 = 100

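# To check the account's concurrency limits in the current Region
# (see ConcurrentExecutions and UnreservedConcurrentExecutions under AccountLimit):
aws lambda get-account-settings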
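
The calculation above can also be scripted when you want to compare several traffic scenarios against a limit. The following is a minimal sketch, not an AWS tool: the function names `required_concurrency` and `limit_usage_percent` are hypothetical helpers, and the duration is passed in milliseconds so that plain integer shell arithmetic can be used, rounding the result up.

```shell
#!/bin/sh
# Hypothetical helper (not part of the AWS CLI).
# Estimates concurrency as ceil(rps * duration_ms / 1000).
# $1 = requests per second, $2 = average duration in milliseconds
required_concurrency() {
  echo $(( ($1 * $2 + 999) / 1000 ))
}

# Hypothetical helper: share of a concurrency limit consumed,
# as a whole percent (rounded down).
# $1 = estimated concurrency, $2 = concurrency limit
limit_usage_percent() {
  echo $(( $1 * 100 / $2 ))
}

# The lesson's example: 500 RPS at 0.2 s (200 ms) per invocation,
# compared against the default limit of 1,000.
needed=$(required_concurrency 500 200)
echo "Estimated concurrency: $needed"
echo "Share of a 1,000 limit: $(limit_usage_percent "$needed" 1000)%"
```

For the lesson's numbers this prints an estimated concurrency of 100, which is 10% of the default limit. The limit value is hard-coded here as an assumption; in practice you would substitute the figure reported by `aws lambda get-account-settings`.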
