Rate Limiting and Retry Logic
Exponential backoff, 429 handling, and respectful API consumption.
What Is Rate Limiting?
Rate limiting is how APIs protect themselves from being overwhelmed. When your agent sends too many requests too quickly, the API returns 429 Too Many Requests. Common limits include requests per second, per minute, or per day.
Ignoring rate limits can lead to blocked agents, revoked API keys, and extra charges.
import requests
response = requests.get(
    'https://api.example.com/data',
    headers={'Authorization': 'Bearer YOUR_KEY'}
)
if response.status_code == 429:
    print('Rate limit exceeded!')
    # Check headers for limit details (header names vary by API)
    limit = response.headers.get('X-RateLimit-Limit')
    remaining = response.headers.get('X-RateLimit-Remaining')
    reset = response.headers.get('X-RateLimit-Reset')
    print(f'Limit: {limit}, Remaining: {remaining}, Reset: {reset}')

The Retry-After Header
When an API returns 429, it often includes a Retry-After header telling you how long to wait before retrying, usually as a number of seconds (the HTTP specification also allows a date). Always respect this header: ignoring it and retrying immediately will just get you another 429.
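import time
import requests
def request_with_retry_after(url, headers):
    response = requests.get(url, headers=headers)
    if response.status_code == 429:
        # Retry-After is usually a number of seconds. Fall back to 60
        # if it is missing or not an integer (it may be an HTTP date).
        try:
            retry_after = int(response.headers.get('Retry-After', 60))
        except ValueError:
            retry_after = 60
        print(f'Rate limited. Waiting {retry_after} seconds...')
        time.sleep(retry_after)
        # Retry once after waiting
        response = requests.get(url, headers=headers)
    # Raise an exception for any remaining 4xx/5xx status
    response.raise_for_status()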
    return response.json()
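
The function above retries only once. The exponential backoff named in the lesson subtitle could be sketched as follows. This is a minimal illustration, not a definitive implementation: the names parse_retry_after, backoff_delay, and retry_on_429 are hypothetical, and the defaults (1 second base delay, 60 second cap, 5 retries) are assumptions to adjust for your API. It uses the "full jitter" variant, where each wait is a random value up to the exponential bound, and it prefers the server's Retry-After value when one is present. The send argument is any zero-argument callable that returns a response object with status_code and headers attributes.

```python
import random
import time
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def parse_retry_after(value, default=None):
    """Return the Retry-After delay in seconds, or `default` if unusable."""
    if value is None:
        return default
    try:
        return max(0.0, float(int(value)))  # delay-in-seconds form
    except ValueError:
        pass
    try:
        when = parsedate_to_datetime(value)  # HTTP-date form
    except (TypeError, ValueError):
        return default
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)
    # A date in the past means no wait is needed
    return max(0.0, (when - datetime.now(timezone.utc)).total_seconds())


def backoff_delay(attempt, base_delay=1.0, max_delay=60.0):
    """Full jitter: a random wait in [0, min(max_delay, base_delay * 2**attempt)]."""
    return random.uniform(0, min(max_delay, base_delay * 2 ** attempt))


def retry_on_429(send, max_retries=5, sleep=time.sleep):
    """Call send() until it returns a non-429 response or retries run out."""
    for attempt in range(max_retries + 1):
        response = send()
        if response.status_code != 429 or attempt == max_retries:
            return response
        # Prefer the server's hint; otherwise back off exponentially
        delay = parse_retry_after(response.headers.get('Retry-After'))
        if delay is None:
            delay = backoff_delay(attempt)
        sleep(delay)
```

With the requests library this could be called as retry_on_429(lambda: requests.get(url, headers=headers)). The randomness spreads retries out so that many clients that were rate limited at the same moment do not all retry at the same moment.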