NestJS Enterprise Backend APIs · Lesson

Distributed Tracing with OpenTelemetry

Instrument controllers, providers, and HTTP clients to emit correlated spans across services.

Why Distributed Tracing

In a microservice fleet, a single user request might touch a gateway, an orders service, a payments service, and a third-party HTTP API. When latency spikes, per-service logs alone rarely show which hop was slow, because nothing ties the log lines from different services to the same request.

Distributed tracing stitches these hops together. Each unit of work becomes a span; spans linked by a shared trace_id form one end-to-end trace.

  • trace_id — the same value across every service in one request
  • span_id — unique per operation
  • parent_span_id — how spans nest into a tree
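
To make the three identifiers concrete, here is a minimal sketch of how a tracing backend might rebuild the tree for one trace from a flat list of spans. The SpanRecord and SpanNode types, the function names, and the short span IDs are hypothetical and chosen for readability; this is not how any particular backend is implemented.

```typescript
// Hypothetical minimal record; real OTel spans carry many more fields.
interface SpanRecord {
  traceId: string;
  spanId: string;
  parentSpanId?: string;
  name: string;
}

interface SpanNode extends SpanRecord {
  children: SpanNode[];
}

// Turn a flat list of spans from one trace into a tree via parentSpanId.
function buildTraceTree(spans: SpanRecord[]): SpanNode[] {
  const nodes = new Map<string, SpanNode>();
  const ordered: SpanNode[] = [];
  spans.forEach((s) => {
    const node: SpanNode = { ...s, children: [] };
    nodes.set(s.spanId, node);
    ordered.push(node);
  });

  const roots: SpanNode[] = [];
  ordered.forEach((node) => {
    const parent = node.parentSpanId ? nodes.get(node.parentSpanId) : undefined;
    if (parent) {
      parent.children.push(node);
    } else {
      // No parent ID, or the parent span is missing from this batch.
      roots.push(node);
    }
  });
  return roots;
}

// Render the tree as indented lines, two spaces per nesting level.
function renderTree(nodes: SpanNode[], depth = 0): string[] {
  const lines: string[] = [];
  nodes.forEach((n) => {
    lines.push('  '.repeat(depth) + n.name);
    renderTree(n.children, depth + 1).forEach((line) => lines.push(line));
  });
  return lines;
}

// Shortened span IDs for readability; real ones are 16 hex characters.
const sharedTraceId = '4bf92f3577b34da6a3ce929d0e0e4736';
const flatSpans: SpanRecord[] = [
  { traceId: sharedTraceId, spanId: 'a1', name: 'gateway GET /checkout' },
  { traceId: sharedTraceId, spanId: 'b2', parentSpanId: 'a1', name: 'orders createOrder' },
  { traceId: sharedTraceId, spanId: 'c3', parentSpanId: 'b2', name: 'payments charge' },
  { traceId: sharedTraceId, spanId: 'd4', parentSpanId: 'c3', name: 'HTTP POST third-party API' },
];

console.log(renderTree(buildTraceTree(flatSpans)).join('\n'));
```

Every span carries the same traceId, so the backend can collect them even though they arrive from different services; parentSpanId alone determines the nesting.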

OpenTelemetry (OTel) is the vendor-neutral standard for producing and propagating these spans, and it is what we'll wire into NestJS.

The Anatomy of a Span

A span is just a typed object describing one operation in time. Before touching NestJS, it helps to model what OTel actually emits. Below is a plain TypeScript sketch of a simplified subset of the fields the SDK populates.

Note the kind field: SERVER spans represent inbound requests, and CLIENT spans represent outbound calls. (OTel also defines PRODUCER and CONSUMER kinds for messaging, which this sketch omits.) Correlating a CLIENT span in service A with a SERVER span in service B is exactly what context propagation buys you.

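// Simplified: OTel also defines PRODUCER and CONSUMER span kinds.
type SpanKind = 'SERVER' | 'CLIENT' | 'INTERNAL';

interface Span {
  traceId: string;        // 32 hex characters, shared by every span in the trace
  spanId: string;         // 16 hex characters, unique per operation
  parentSpanId?: string;  // absent on the root span
  name: string;
  kind: SpanKind;
  startTimeMs: number;
  endTimeMs: number;
  attributes: Record<string, string | number | boolean>;
}

// Wall-clock duration of a finished span, in milliseconds.
function durationMs(span: Span): number {
  return span.endTimeMs - span.startTimeMs;
}

// A root SERVER span: no parentSpanId because nothing upstream started the trace.
const span: Span = {
  traceId: '4bf92f3577b34da6a3ce929d0e0e4736',
  spanId: '00f067aa0ba902b7',
  name: 'GET /orders/:id',
  kind: 'SERVER',
  startTimeMs: 1000,
  endTimeMs: 1042,
  // Older HTTP semantic-convention names; newer OTel releases use
  // 'http.request.method' and 'http.response.status_code'.
  attributes: { 'http.method': 'GET', 'http.route': '/orders/:id', 'http.status_code': 200 },
};

// Prints: GET /orders/:id took 42ms
console.log(`${span.name} took ${durationMs(span)}ms`);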
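
The CLIENT-to-SERVER correlation mentioned above depends on the caller sending its trace context along with the request. The sketch below hand-rolls that step using the W3C Trace Context traceparent header format (version-traceid-parentid-flags). The PropagatedContext type and the helper names are hypothetical, and in a real service the OTel SDK's propagators would do this for you; treat it as an illustration of the mechanism rather than the SDK's API.

```typescript
// Hypothetical context type for this sketch; not the OTel SDK's API.
interface PropagatedContext {
  traceId: string; // 32 lowercase hex characters
  spanId: string;  // 16 lowercase hex characters
  sampled: boolean;
}

// Non-cryptographic random hex string; adequate for a demo only.
function randomHex(byteCount: number): string {
  let out = '';
  for (let i = 0; i < byteCount; i++) {
    const byte = Math.floor(Math.random() * 256);
    out += (byte < 16 ? '0' : '') + byte.toString(16);
  }
  return out;
}

// Caller side: write the current span's identity into the outbound headers.
function injectTraceparent(ctx: PropagatedContext, headers: Record<string, string>): void {
  const flags = ctx.sampled ? '01' : '00';
  headers['traceparent'] = `00-${ctx.traceId}-${ctx.spanId}-${flags}`;
}

// Callee side: parse the header; return undefined if it is absent or malformed.
function extractTraceparent(headers: Record<string, string>): PropagatedContext | undefined {
  const match = /^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$/.exec(headers['traceparent'] ?? '');
  if (!match) return undefined;
  const [, traceId, spanId, flags] = match;
  // All-zero IDs are invalid under the W3C Trace Context format.
  if (/^0+$/.test(traceId) || /^0+$/.test(spanId)) return undefined;
  return { traceId, spanId, sampled: (parseInt(flags, 16) & 1) === 1 };
}

// Service A: a CLIENT span is active while making an outbound call.
const clientSpan: PropagatedContext = { traceId: randomHex(16), spanId: randomHex(8), sampled: true };
const outboundHeaders: Record<string, string> = {};
injectTraceparent(clientSpan, outboundHeaders);

// Service B: start a SERVER span as a child of whatever the header describes.
const remoteParent = extractTraceparent(outboundHeaders);
const serverSpan = {
  traceId: remoteParent ? remoteParent.traceId : randomHex(16), // new trace if no header
  spanId: randomHex(8),
  parentSpanId: remoteParent ? remoteParent.spanId : undefined,
  kind: 'SERVER',
};

console.log(outboundHeaders['traceparent']);
console.log('same trace:', serverSpan.traceId === clientSpan.traceId);
console.log('child of client span:', serverSpan.parentSpanId === clientSpan.spanId);
```

Because service B reuses the incoming traceId and records the caller's spanId as its parentSpanId, the two spans land in the same trace even though they were created in different processes.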
