AI Engineering Academy · Lesson

Fallback Providers and Circuit Breakers

Build a provider cascade that automatically fails over from OpenAI to Anthropic to a local model when the primary provider is slow or unavailable, using a circuit breaker pattern.

Single Provider Risk

Relying on a single LLM provider creates a single point of failure. OpenAI has experienced outages that took minutes to hours to resolve. If your entire application depends on GPT-4o being available, any provider incident immediately translates to user-facing downtime. A fallback provider strategy maintains service continuity by routing to alternative providers when the primary fails.

Defining a Provider Cascade

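A provider cascade is an ordered list of providers and models tried in sequence. When the primary fails or times out, the system automatically tries the next provider. A typical cascade might be: OpenAI GPT-4o → Anthropic Claude 3.5 Sonnet → a locally deployed Llama model. Each level is a fallback for the one before it, with the local model as the last resort. Because it runs on infrastructure you control, it is unaffected by third-party API outages, although it can still fail for its own reasons (crashed process, exhausted GPU memory) and should be monitored like any other dependency.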

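from dataclasses import dataclass
from typing import Optional

@dataclass
class Provider:
    name: str                 # label used in logs and metrics
    base_url: Optional[str]   # None: use the client library's default endpoint
    api_key_env: str          # name of the environment variable holding the API key
    model: str                # model identifier sent with each request
    priority: int             # lower = higher priority

# Model IDs below are illustrative; use the exact identifiers that your
# providers and your local inference server actually expose.
CASCADE = [
    Provider('openai',    None,                           'OPENAI_API_KEY',    'gpt-4o',            1),
    Provider('anthropic', 'https://api.anthropic.com/v1', 'ANTHROPIC_API_KEY', 'claude-3-5-sonnet', 2),
    Provider('local',     'http://localhost:8000/v1',     'LOCAL_KEY',         'llama-3.1-8b-inst', 3),
]

# Sort by priority so that list order never has to be trusted implicitly.
CASCADE.sort(key=lambda p: p.priority)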
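
One way to walk a cascade like this with a circuit breaker per provider is sketched below. This is a minimal illustration under stated assumptions, not a production implementation: `CircuitBreaker`, `complete_with_fallback`, and the threshold and cooldown values are hypothetical names and numbers chosen for the example. Each provider is represented as a plain `(name, call)` pair, where `call` is assumed to be a function that sends the prompt to that provider, enforces its own timeout, and raises an exception on failure or timeout.

```python
import time
from typing import Callable, Dict, List, Optional, Sequence, Tuple


class CircuitBreaker:
    """Skips a provider for `cooldown_s` seconds after `max_failures` consecutive failures."""

    def __init__(self, max_failures: int = 3, cooldown_s: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.max_failures = max_failures
        self.cooldown_s = cooldown_s
        self.clock = clock                      # injectable so tests can control time
        self.failures = 0                       # consecutive failures seen so far
        self.opened_at: Optional[float] = None  # None means the circuit is closed

    def allow(self) -> bool:
        # Closed: requests flow normally.
        if self.opened_at is None:
            return True
        # Open: block until the cooldown has elapsed, then let a trial
        # request through (the "half-open" state).
        return self.clock() - self.opened_at >= self.cooldown_s

    def record_success(self) -> None:
        # Any success closes the circuit and resets the failure count.
        self.failures = 0
        self.opened_at = None

    def record_failure(self) -> None:
        # Open (or re-open) the circuit once the threshold is reached.
        self.failures += 1
        if self.failures >= self.max_failures:
            self.opened_at = self.clock()


def complete_with_fallback(
    prompt: str,
    providers: Sequence[Tuple[str, Callable[[str], str]]],
    breakers: Dict[str, CircuitBreaker],
) -> Tuple[str, str]:
    """Try providers in order; return (provider_name, response) from the first that succeeds."""
    errors: List[str] = []
    for name, call in providers:
        breaker = breakers.setdefault(name, CircuitBreaker())
        if not breaker.allow():
            errors.append(f"{name}: circuit open")
            continue
        try:
            result = call(prompt)  # assumed to raise on error or timeout
        except Exception as exc:
            breaker.record_failure()
            errors.append(f"{name}: {exc}")
            continue
        breaker.record_success()
        return name, result
    raise RuntimeError("all providers failed: " + "; ".join(errors))
```

The `breakers` dictionary is meant to be shared across requests so that failure counts accumulate. Once a provider's circuit opens, later requests skip it immediately instead of waiting for another timeout, and a single trial request is let through after the cooldown to check whether the provider has recovered.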

All lessons in this course

  1. Measuring LLM Latency: TTFT and TPOT
  2. Load Balancing and Multi-Key Strategies
  3. Fallback Providers and Circuit Breakers
  4. Timeout Budgets and Graceful Degradation