Production Debugging & Incident Response Playbook · Lesson

Distributed Tracing for Latency Hotspots

Learn to use distributed tracing to follow a single request across services, identify latency hotspots, and correlate traces with logs during production debugging.

Why Distributed Tracing

In a microservice system, a single user request can fan out across dozens of services. When that request is slow, which service is to blame?

Distributed tracing answers this by attaching a shared trace_id to a request and recording a span for every operation it touches.

  • A trace = the whole request journey
  • A span = one timed unit of work
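To make the trace/span relationship concrete, here is a minimal in-memory sketch. The `Tracer` and `Span` classes are hypothetical and written only for this illustration; they are not the API of any real tracing library, and a production system would normally rely on an established tracing SDK. The sketch assumes a single process and a single thread, so a simple stack is enough to track the current parent span.

```python
import time
import uuid
from contextlib import contextmanager
from dataclasses import dataclass
from typing import Iterator, List, Optional


@dataclass
class Span:
    """One timed unit of work (hypothetical structure for illustration)."""
    trace_id: str
    span_id: str
    parent_id: Optional[str]
    name: str
    start_ms: float
    end_ms: Optional[float] = None


class Tracer:
    """Hypothetical tracer: one shared trace_id, one span per operation."""

    def __init__(self) -> None:
        self.trace_id = uuid.uuid4().hex   # shared by every span of this request
        self.finished: List[Span] = []     # spans in the order they completed
        self._stack: List[Span] = []       # currently open spans, innermost last

    @contextmanager
    def span(self, name: str) -> Iterator[Span]:
        parent = self._stack[-1] if self._stack else None
        current = Span(
            trace_id=self.trace_id,
            span_id=uuid.uuid4().hex[:8],
            parent_id=parent.span_id if parent else None,
            name=name,
            start_ms=time.monotonic() * 1000,
        )
        self._stack.append(current)
        try:
            yield current
        finally:
            # Record the end time even if the wrapped operation raises.
            current.end_ms = time.monotonic() * 1000
            self._stack.pop()
            self.finished.append(current)


tracer = Tracer()
with tracer.span("GET /users") as root:
    with tracer.span("db.query.users") as child:
        time.sleep(0.01)  # stand-in for real work
```

Both spans carry the same `trace_id`, and the inner span's `parent_id` points at the outer span. In a real system the `trace_id` and the current `span_id` would also have to travel with outgoing calls (for example in request headers) so that downstream services can attach their own spans to the same trace.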

Anatomy of a Span

Each span carries timing and context so you can reconstruct the call tree.

  • trace_id links all spans of one request
  • span_id identifies the operation
  • parent_id records who called it
  • start/end timestamps give the duration (end minus start; in the example below, 1310 - 1042 = 268 ms)
{
  "trace_id": "abc123",
  "span_id": "s2",
  "parent_id": "s1",
  "name": "db.query.users",
  "start_ms": 1042,
  "end_ms": 1310
}
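
The fields above are enough to rebuild the call tree and look for a hotspot. The following sketch assumes spans arrive as dictionaries shaped like the example. Only span `s2` comes from the example; the root span `s1` and the sibling `s3` are invented here so that there is a tree to walk. It also assumes child spans run sequentially, so a span's "self time" can be approximated as its duration minus the durations of its direct children.

```python
from collections import defaultdict
from typing import Dict, List, Optional

SpanDict = Dict[str, object]

spans: List[SpanDict] = [
    # s1 and s3 are hypothetical; s2 is the span from the example above.
    {"trace_id": "abc123", "span_id": "s1", "parent_id": None,
     "name": "GET /users", "start_ms": 1000, "end_ms": 1350},
    {"trace_id": "abc123", "span_id": "s3", "parent_id": "s1",
     "name": "cache.get", "start_ms": 1010, "end_ms": 1030},
    {"trace_id": "abc123", "span_id": "s2", "parent_id": "s1",
     "name": "db.query.users", "start_ms": 1042, "end_ms": 1310},
]


def duration_ms(span: SpanDict) -> int:
    """Duration is simply end minus start."""
    return span["end_ms"] - span["start_ms"]


def group_by_parent(all_spans: List[SpanDict]) -> Dict[Optional[str], List[SpanDict]]:
    """Map each parent_id to its direct children (None holds the root spans)."""
    children: Dict[Optional[str], List[SpanDict]] = defaultdict(list)
    for span in all_spans:
        children[span["parent_id"]].append(span)
    return children


def self_time_ms(span: SpanDict, children: Dict[Optional[str], List[SpanDict]]) -> int:
    """Time spent in the span itself, assuming its children do not overlap."""
    child_total = sum(duration_ms(c) for c in children[span["span_id"]])
    return duration_ms(span) - child_total


def render_tree(children, parent_id=None, depth=0) -> List[str]:
    """Return one indented line per span, parents before children."""
    lines: List[str] = []
    for span in sorted(children[parent_id], key=lambda s: s["start_ms"]):
        lines.append(
            f"{'  ' * depth}{span['name']}: {duration_ms(span)} ms "
            f"(self {self_time_ms(span, children)} ms)"
        )
        lines.extend(render_tree(children, span["span_id"], depth + 1))
    return lines


children = group_by_parent(spans)
tree_lines = render_tree(children)
hotspot = max(spans, key=lambda s: self_time_ms(s, children))

print("\n".join(tree_lines))
print("hotspot:", hotspot["name"])
```

Ranking by self time rather than total duration keeps the root span from always looking like the culprit, since its total duration includes everything it called. If child spans can run in parallel, the subtraction above overcounts and a more careful interval merge would be needed.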
