Health Probes, Readiness Gates, and Wait Loops
Implement dependency wait loops and liveness probes that make containerized services start reliably.
Why Services Fail at Startup
In containerized environments, services rarely start in isolation. A web application needs its database ready, a microservice needs its message broker available, and a worker needs its cache populated before it can process jobs.
Without coordination, containers start in parallel and the first requests arrive before dependencies are healthy. The result: connection refused errors, null pointer exceptions, and corrupted startup state that require manual restarts.
- Liveness probe — is the process still running and not deadlocked?
- Readiness probe — is the service ready to accept traffic?
- Wait loop — a startup script that blocks until dependencies are reachable
Kubernetes provides built-in probe mechanisms, but the shell scripts that power initContainers, entrypoint.sh wrappers, and standalone health checks are written in Bash. Mastering them is a core DevOps skill.
The wait-for Pattern
The simplest dependency wait loop polls a target on a fixed interval until it becomes reachable. The canonical shape uses a while loop with nc (netcat) or curl to probe a TCP port or HTTP endpoint.
Key design decisions:
- Timeout — bail out after N seconds so a broken dependency does not hang the pod forever
- Backoff interval — sleep between probes to avoid hammering a recovering service
- Exit code — exit 1 on timeout so the container restarts (or init container fails loudly)
The snippet below waits up to 60 seconds for a TCP port to accept connections before launching the main process.
#!/usr/bin/env bash
set -euo pipefail
HOST="${DB_HOST:-postgres}"
PORT="${DB_PORT:-5432}"
TIMEOUT=60
INTERVAL=2
ELAPSED=0
echo "[wait-for] Waiting for ${HOST}:${PORT}..."
until nc -z "$HOST" "$PORT" 2>/dev/null; do
if (( ELAPSED >= TIMEOUT )); then
echo "[wait-for] Timed out after ${TIMEOUT}s waiting for ${HOST}:${PORT}" >&2
exit 1
fi
echo "[wait-for] ${HOST}:${PORT} not ready — retrying in ${INTERVAL}s (${ELAPSED}s elapsed)"
sleep "$INTERVAL"
(( ELAPSED += INTERVAL ))
done
echo "[wait-for] ${HOST}:${PORT} is up — continuing"
exec "$@"All lessons in this course
- Writing Lean Dockerfiles and Shell Entrypoints
- Templating Configs with envsubst and heredocs
- Scripting Cloud Resources via CLI and jq
- Health Probes, Readiness Gates, and Wait Loops