DSA Interview Prep · Lesson

Caching, CDNs, and Load Balancing

Add Redis caching layers, push static assets to a CDN, and distribute traffic across replicas with round-robin and consistent-hashing load balancers.

Why Caching Is Essential at Scale

Caching stores copies of frequently accessed data in a faster storage layer so that future requests can be served without hitting the slower backing store (a database or an external API). At scale, a small number of popular items usually receive most of the requests. The 80/20 rule (Pareto principle) is a common rule of thumb: roughly 20% of items account for roughly 80% of traffic.

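A cache that holds the hot 20% in memory can then absorb roughly 80% of the reads that would otherwise reach the database. This is why adding a Redis cache in front of a read-heavy database often reduces database CPU by 70–90% and cuts latency for cache hits from around 10 ms to under 1 ms, without significant changes to the database or the application logic. The actual gain depends on how skewed the workload is and on the hit ratio the cache achieves.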

# Demonstrating the 80/20 caching benefit
import random

# Simulate 10,000 requests to 100 items with a Zipf-like distribution
def zipf_sample(n_items, n_requests):
    # Zipf-like weights: item 0 is the most popular, item i has weight 1/(i+1)
    weights = [1.0 / (i + 1) for i in range(n_items)]
    # random.choices accepts unnormalized weights; draw all requests in one call
    requests = random.choices(range(n_items), weights=weights, k=n_requests)
    access_counts = {}
    for item in requests:
        access_counts[item] = access_counts.get(item, 0) + 1
    return access_counts

random.seed(42)
n_items, n_requests = 100, 10_000
counts = zipf_sample(n_items, n_requests)
top_20_items = sorted(counts, key=counts.get, reverse=True)[:n_items // 5]
top_20_requests = sum(counts[i] for i in top_20_items)
share = 100 * top_20_requests / n_requests
# With 1/(i+1) weights the share comes out at roughly 70%, so 80/20 is a rule of thumb
print(f'Top 20% of items ({len(top_20_items)} of {n_items}) handle {share:.1f}% of requests')
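
The share of traffic going to the most popular items is an upper bound on what a cache of that size can absorb. A real cache has to discover the hot set through its eviction policy. The following is a minimal sketch, assuming an LRU (least recently used) policy and the same Zipf-like weights as above. The function name `simulate_lru_hit_ratio` and its parameters are illustrative, not part of any library.

```python
import random
from collections import OrderedDict

def simulate_lru_hit_ratio(n_items=100, capacity=20, n_requests=10_000, seed=42):
    """Replay a Zipf-like request stream against an LRU cache and return the hit ratio."""
    rng = random.Random(seed)
    # Same Zipf-like popularity as above: item i has weight 1/(i+1)
    weights = [1.0 / (rank + 1) for rank in range(n_items)]
    requests = rng.choices(range(n_items), weights=weights, k=n_requests)

    cache = OrderedDict()  # keys ordered from least to most recently used
    hits = 0
    for item in requests:
        if item in cache:
            hits += 1
            cache.move_to_end(item)        # mark as most recently used
        else:
            cache[item] = True             # "load from the database" and cache it
            if len(cache) > capacity:
                cache.popitem(last=False)  # evict the least recently used key
    return hits / n_requests

hit_ratio = simulate_lru_hit_ratio()
print(f'LRU cache holding 20 of 100 items: hit ratio {hit_ratio:.1%}')
```

The measured hit ratio is typically somewhat below the top-20% traffic share, because LRU occasionally evicts a popular item to make room for a rarely requested one. Raising `capacity` or making the workload more skewed moves the two numbers closer together.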

Cache-Aside Pattern (Lazy Loading)

The cache-aside pattern (also called lazy loading) is the most common caching strategy. The application code is responsible for managing the cache: on a read, check the cache first. On a cache hit, return immediately. On a cache miss, fetch from the database, write to cache, then return. On a write, update the database and invalidate (delete) the cache entry so the next read refreshes it.

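This pattern ensures the cache holds only data that was actually requested (no unnecessary pre-loading). Invalidating on every write keeps the cache close to the database state, and a TTL bounds how long a stale entry can survive if an invalidation is missed or races with a concurrent read. The trade-off: every cache miss, including the first access to each key (cold start), pays the full database cost plus the cache write.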

# Cache-aside pattern in Python
import json

class CacheAsideService:
    def __init__(self, db, cache):
        self.db = db         # hypothetical DB wrapper exposing query() / execute()
        self.cache = cache   # e.g., a redis-py client (redis.Redis)

    def get_user(self, user_id):
        cache_key = f'user:{user_id}'
        # 1. Check cache (compare with None so falsy values still count as hits)
        cached = self.cache.get(cache_key)
        if cached is not None:
            return json.loads(cached)    # cache hit: deserialize the stored JSON
        # 2. Cache miss: fetch from DB (assumed to return a JSON-serializable dict or None)
        user = self.db.query('SELECT * FROM users WHERE id=%s', user_id)
        # 3. Write to cache with TTL; redis-py takes the expiry in seconds as ex=
        if user is not None:
            self.cache.set(cache_key, json.dumps(user), ex=3600)  # 1 hour TTL
        return user

    def update_user(self, user_id, data):
        # 1. Write to DB
        self.db.execute('UPDATE users SET ... WHERE id=%s', user_id, data)
        # 2. Invalidate cache (delete, not update)
        self.cache.delete(f'user:{user_id}')
        # Next read will re-populate cache from DB

print('Cache-aside: READ from cache, miss? load from DB + write cache')
print('         WRITE to DB, then DELETE from cache (invalidate)')
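
To watch the read and write paths work end to end without a running Redis or database, the flow can be exercised against in-memory stand-ins. This is a sketch under stated assumptions: `InMemoryTTLCache`, `FakeUserDB`, and `UserService` are hypothetical helpers written for this example, and the cache only mimics the get / set-with-expiry / delete behavior described above.

```python
import time

class InMemoryTTLCache:
    """Hypothetical stand-in for a Redis-like cache: get, set with expiry, delete."""
    def __init__(self):
        self._store = {}  # key -> (value, expiry timestamp)

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() >= expires_at:
            del self._store[key]  # expired entries behave like misses
            return None
        return value

    def set(self, key, value, ttl_seconds):
        self._store[key] = (value, time.monotonic() + ttl_seconds)

    def delete(self, key):
        self._store.pop(key, None)

class FakeUserDB:
    """Hypothetical database stand-in that counts how often it is read."""
    def __init__(self):
        self.rows = {1: {'id': 1, 'name': 'Ada'}}
        self.reads = 0

    def fetch_user(self, user_id):
        self.reads += 1
        row = self.rows.get(user_id)
        return dict(row) if row is not None else None

    def update_user(self, user_id, fields):
        self.rows[user_id].update(fields)

class UserService:
    def __init__(self, db, cache, ttl_seconds=3600):
        self.db, self.cache, self.ttl_seconds = db, cache, ttl_seconds

    def get_user(self, user_id):
        key = f'user:{user_id}'
        cached = self.cache.get(key)
        if cached is not None:
            return cached                          # cache hit
        user = self.db.fetch_user(user_id)         # cache miss: go to the DB
        if user is not None:
            self.cache.set(key, user, self.ttl_seconds)
        return user

    def update_user(self, user_id, fields):
        self.db.update_user(user_id, fields)       # 1. write to the DB
        self.cache.delete(f'user:{user_id}')       # 2. invalidate the cache entry

db = FakeUserDB()
service = UserService(db, InMemoryTTLCache())
first = service.get_user(1)             # miss: one DB read, then cached
second = service.get_user(1)            # hit: no DB read
reads_before_update = db.reads
service.update_user(1, {'name': 'Grace'})
third = service.get_user(1)             # miss again after invalidation
print(f'DB reads before update: {reads_before_update}, after: {db.reads}')
print(f'Name before update: {first["name"]}, after: {third["name"]}')
```

The read counter makes the pattern visible: two reads of the same user cost one database query, and the update forces exactly one more query on the next read.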
