Health Checks, Circuit Breakers, and Retry Logic
Use ELB health checks, Route 53 endpoint checks, and application-level circuit breakers to detect failures and reroute traffic automatically.
Why Automated Failure Detection Matters
In distributed systems, component failure is routine: instances crash, network partitions occur, and downstream services become overloaded. Without automated failure detection, traffic keeps flowing to failed components, which can trigger cascading failures. AWS provides several layers of health checking: ELB health checks detect unhealthy targets, Route 53 health checks detect unhealthy endpoints, and Auto Scaling replaces failed instances. Application-level patterns such as circuit breakers and retries complete the resilience picture.
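To make the application-level patterns concrete, here is a minimal Python sketch of a circuit breaker combined with retries that use capped exponential backoff and jitter. It is an illustration, not a production library: the names CircuitBreaker, CircuitOpenError, and retry_with_backoff, and all default values, are assumptions made for this example.

```python
import random
import time


class CircuitOpenError(Exception):
    """Raised when the breaker is open and the call is rejected without being attempted."""


class CircuitBreaker:
    """Minimal three-state breaker (hypothetical sketch): closed -> open -> half-open -> closed."""

    def __init__(self, failure_threshold=3, reset_timeout=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold  # consecutive failures before opening
        self.reset_timeout = reset_timeout          # seconds to wait before allowing a trial call
        self.clock = clock                          # injectable so tests can control time
        self.failures = 0
        self.opened_at = None                       # None means the circuit is closed

    @property
    def state(self):
        if self.opened_at is None:
            return "closed"
        if self.clock() - self.opened_at >= self.reset_timeout:
            return "half-open"
        return "open"

    def call(self, func, *args, **kwargs):
        if self.state == "open":
            raise CircuitOpenError("circuit open; failing fast")
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            # A failed trial call while half-open, or too many failures, (re)opens the circuit.
            if self.opened_at is not None or self.failures >= self.failure_threshold:
                self.opened_at = self.clock()
            raise
        # Any success closes the circuit and resets the failure count.
        self.failures = 0
        self.opened_at = None
        return result


def retry_with_backoff(func, max_attempts=3, base_delay=0.1, max_delay=2.0, sleep=time.sleep):
    """Retry func with capped exponential backoff and full jitter (hypothetical helper)."""
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except CircuitOpenError:
            raise  # retrying is pointless while the breaker is rejecting calls
        except Exception:
            if attempt == max_attempts:
                raise
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(random.uniform(0, delay))
```

A caller would typically wrap the downstream call in the breaker and the breaker call in the retry helper, for example retry_with_backoff(lambda: breaker.call(fetch_orders)), where fetch_orders stands in for any function that calls a downstream service. Once the breaker opens, requests fail fast instead of piling onto the struggling dependency, and the jittered delays keep many clients from retrying in lockstep.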
ELB Health Checks
Elastic Load Balancing (ELB) health checks periodically send requests to registered targets to determine whether they are healthy. You configure the health check path (e.g., /health), protocol, port, interval (default 30 seconds), and the healthy and unhealthy thresholds: the number of consecutive successes needed to mark a target healthy, and the number of consecutive failures needed to mark it unhealthy. When a target crosses the unhealthy threshold, the load balancer stops routing new requests to it. Health checks continue against the unhealthy target, and it is added back once it passes the healthy threshold.
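The consecutive-threshold behavior described above can be modeled as a small state machine. The Python sketch below is only an illustration of that logic, not AWS's implementation: the class name TargetHealth is hypothetical, and it assumes a target starts out healthy, whereas a real target begins in an initial state until its first checks complete.

```python
class TargetHealth:
    """Tracks one target's state from a stream of health check results (illustrative only)."""

    def __init__(self, healthy_threshold=2, unhealthy_threshold=3, healthy=True):
        self.healthy_threshold = healthy_threshold      # consecutive passes needed to recover
        self.unhealthy_threshold = unhealthy_threshold  # consecutive failures needed to eject
        self.healthy = healthy                          # assumption: targets start healthy
        self.streak = 0  # consecutive results that contradict the current state

    def record(self, check_passed):
        """Record one health check result and return the target's resulting state."""
        if check_passed == self.healthy:
            # The result agrees with the current state, so any opposing streak is broken.
            self.streak = 0
            return self.healthy
        self.streak += 1
        needed = self.healthy_threshold if check_passed else self.unhealthy_threshold
        if self.streak >= needed:
            self.healthy = check_passed
            self.streak = 0
        return self.healthy


target = TargetHealth(healthy_threshold=2, unhealthy_threshold=3)
results = [True, False, False, True, False, False, False, True, True]
states = [target.record(r) for r in results]
print(states)
# [True, True, True, True, True, True, False, False, True]
```

Two isolated failures do not eject the target because the intervening success resets the streak; only three failures in a row do. As a rough rule of thumb, detection time is about the interval multiplied by the unhealthy threshold, so a 15-second interval with a threshold of 3 would take on the order of 45 seconds to stop sending traffic to a failed target.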
# Configure ALB target group health check
aws elbv2 modify-target-group \
--target-group-arn arn:aws:elasticloadbalancing::123:targetgroup/my-tg/abc \
--health-check-protocol HTTPS \
--health-check-port 443 \
--health-check-path /health \
--health-check-interval-seconds 15 \
--healthy-threshold-count 2 \
--unhealthy-threshold-count 3 \
--matcher HttpCode=200