Distributed Tracing with OpenTelemetry
Instrument controllers, providers, and HTTP clients to emit correlated spans across services.
Why Distributed Tracing
In a microservice fleet, a single user request might touch a gateway, an orders service, a payments service, and a third-party HTTP API. When latency spikes, logs alone cannot tell you which hop was slow.
Distributed tracing stitches these hops together. Each unit of work becomes a span; spans linked by a shared trace_id form one end-to-end trace.
- trace_id: the same value across every service in one request
- span_id: unique per operation
- parent_span_id: how spans nest into a tree
OpenTelemetry (OTel) is the vendor-neutral standard for producing and propagating these spans, and it is what we'll wire into NestJS.
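To make the three identifiers concrete, here is a minimal sketch of how a flat list of spans could be reassembled into a tree using only span_id and parent_span_id. The SpanRecord shape, the helper names buildTraceTree and renderTree, and the sample IDs and span names are all hypothetical, chosen for illustration. A real tracing backend does this for you.

```typescript
// Hypothetical flat span record, roughly what a tracing backend receives.
interface SpanRecord {
  trace_id: string;
  span_id: string;
  parent_span_id?: string; // absent on the root span
  name: string;
}

interface SpanNode extends SpanRecord {
  children: SpanNode[];
}

// Link every span to its parent; spans without a known parent become roots.
function buildTraceTree(records: SpanRecord[]): SpanNode[] {
  const nodes = new Map<string, SpanNode>();
  for (const record of records) {
    nodes.set(record.span_id, { ...record, children: [] });
  }
  const roots: SpanNode[] = [];
  for (const record of records) {
    const node = nodes.get(record.span_id)!;
    const parent = record.parent_span_id ? nodes.get(record.parent_span_id) : undefined;
    if (parent) {
      parent.children.push(node);
    } else {
      roots.push(node);
    }
  }
  return roots;
}

// Render one line per span, indented two spaces per nesting level.
function renderTree(node: SpanNode, depth = 0, out: string[] = []): string[] {
  out.push(`${'  '.repeat(depth)}${node.name}`);
  for (const child of node.children) {
    renderTree(child, depth + 1, out);
  }
  return out;
}

// Illustrative data: spans arrive out of order but share one trace_id.
const traceId = '4bf92f3577b34da6a3ce929d0e0e4736';
const records: SpanRecord[] = [
  { trace_id: traceId, span_id: 'c3', parent_span_id: 'b2', name: 'POST /payments' },
  { trace_id: traceId, span_id: 'a1', name: 'GET /checkout' },
  { trace_id: traceId, span_id: 'd4', parent_span_id: 'c3', name: 'POST third-party charge API' },
  { trace_id: traceId, span_id: 'b2', parent_span_id: 'a1', name: 'POST /orders' },
];

const traceRoots = buildTraceTree(records);
const traceLines = renderTree(traceRoots[0]);
console.log(traceLines.join('\n'));
```

The arrival order of the spans does not matter: the tree is recovered purely from the parent links, which is why each service can export its spans independently.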
The Anatomy of a Span
A span is just a typed object describing one operation in time. Before touching NestJS, it helps to model what OTel actually emits. Below is a plain TypeScript sketch of the fields the SDK populates.
Note the kind field: SERVER spans represent inbound requests, while CLIENT spans represent outbound calls. Correlating a CLIENT span in service A with a SERVER span in service B is exactly what context propagation buys you.
// A subset of OTel's span kinds (PRODUCER and CONSUMER are omitted here).
type SpanKind = 'SERVER' | 'CLIENT' | 'INTERNAL';

interface Span {
  traceId: string;       // shared by every span in one request
  spanId: string;        // unique to this operation
  parentSpanId?: string; // absent on the root span
  name: string;
  kind: SpanKind;
  startTimeMs: number;
  endTimeMs: number;
  attributes: Record<string, string | number | boolean>;
}

// Elapsed time for one span, in milliseconds.
function durationMs(span: Span): number {
  return span.endTimeMs - span.startTimeMs;
}

// A root SERVER span: it has no parentSpanId.
const span: Span = {
  traceId: '4bf92f3577b34da6a3ce929d0e0e4736',
  spanId: '00f067aa0ba902b7',
  name: 'GET /orders/:id',
  kind: 'SERVER',
  startTimeMs: 1000,
  endTimeMs: 1042,
  attributes: { 'http.method': 'GET', 'http.route': '/orders/:id', 'http.status_code': 200 },
};

console.log(`${span.name} took ${durationMs(span)}ms`); // prints "GET /orders/:id took 42ms"
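Context propagation is what lets a CLIENT span in service A become the parent of a SERVER span in service B. OTel's default propagator carries the context in the W3C Trace Context traceparent HTTP header, formatted as version-traceid-parentid-flags. The sketch below hand-rolls that header purely for illustration. The TraceContext type, the helpers formatTraceparent and parseTraceparent, and the ID b7ad6b7169203331 are hypothetical; it also accepts only version 00. In a real application the OTel SDK's propagator does this for you.

```typescript
// Hypothetical minimal view of the context that crosses a service boundary.
interface TraceContext {
  traceId: string; // 32 lowercase hex characters
  spanId: string;  // 16 lowercase hex characters
  sampled: boolean;
}

// Service A: serialize the CLIENT span's context into a traceparent value.
function formatTraceparent(ctx: TraceContext): string {
  return `00-${ctx.traceId}-${ctx.spanId}-${ctx.sampled ? '01' : '00'}`;
}

// Service B: parse the incoming header; return undefined if it is malformed.
// Simplification: only version "00" is accepted in this sketch.
function parseTraceparent(header: string): TraceContext | undefined {
  const match = /^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$/.exec(header);
  if (!match) return undefined;
  const [, traceId, spanId, flags] = match;
  // All-zero IDs are invalid under the W3C format.
  if (/^0+$/.test(traceId) || /^0+$/.test(spanId)) return undefined;
  return { traceId, spanId, sampled: (parseInt(flags, 16) & 1) === 1 };
}

// Service A makes an outbound call inside a CLIENT span.
const clientContext: TraceContext = {
  traceId: '4bf92f3577b34da6a3ce929d0e0e4736',
  spanId: '00f067aa0ba902b7',
  sampled: true,
};
const traceparent = formatTraceparent(clientContext);

// Service B receives the header and starts a SERVER span as a child.
const incoming = parseTraceparent(traceparent);
const serverSpan = incoming && {
  traceId: incoming.traceId,     // same trace as service A
  parentSpanId: incoming.spanId, // points at A's CLIENT span
  spanId: 'b7ad6b7169203331',    // illustrative ID for B's new span
  kind: 'SERVER' as const,
};

console.log(traceparent);
console.log(serverSpan);
```

The SERVER span keeps the incoming traceId and records the caller's spanId as its parentSpanId, which is the link that joins the two services into one trace.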