AWS Solutions Architect · Lesson

HA vs Fault Tolerance: Definitions and Trade-offs

Clarify the distinction between high availability (minimising downtime) and fault tolerance (zero downtime through redundancy), and see how cost scales with each level.

HA vs Fault Tolerance Overview

High Availability (HA) and Fault Tolerance (FT) are two distinct reliability goals that architects often confuse. High availability means a system experiences minimal downtime: it recovers from failures quickly, but users may see a brief interruption while recovery takes place. Fault tolerance means a system continues operating without any interruption even when components fail, because fully redundant paths take over instantly.

Defining Availability Percentages

Availability is usually expressed as the percentage of time a system is operational over a year. 99.9% availability (three nines) allows roughly 8.76 hours of downtime per year, 99.99% (four nines) allows about 52.6 minutes, and 99.999% (five nines) allows just 5.26 minutes. Each additional nine cuts the permitted downtime by a factor of ten and typically requires more redundancy, automation, and cost. The SAA-C03 exam often asks you to identify which architecture meets a given availability target.

# Availability calculations (365-day year = 8760 hours)
# 99.9%   → 8.76 hours/year downtime
# 99.99%  → 52.6 minutes/year downtime
# 99.999% → 5.26 minutes/year downtime

# Formula: downtime = (1 - availability) * 8760 hours, with availability as a fraction (e.g. 0.999)
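
The downtime formula can be checked with a few lines of Python. This is a minimal sketch rather than anything AWS provides: the function name `annual_downtime_hours` and the constant `HOURS_PER_YEAR` are made up for illustration, and a 365-day year is assumed.

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours; assumes a non-leap year


def annual_downtime_hours(availability_percent: float) -> float:
    """Return the downtime budget in hours per year for an availability percentage."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability_percent must be between 0 and 100")
    # Convert the percentage to a fraction, then apply (1 - availability) * 8760.
    return (1 - availability_percent / 100) * HOURS_PER_YEAR


if __name__ == "__main__":
    for target in (99.9, 99.99, 99.999):
        hours = annual_downtime_hours(target)
        print(f"{target}% -> {hours:.2f} hours/year ({hours * 60:.2f} minutes/year)")
```

Running it for three, four, and five nines reproduces the figures listed above, up to rounding. The same function can be used to sanity-check any availability target quoted in an exam question.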
