Distributed Tracing for Latency Hotspots
Learn to use distributed tracing to follow a single request across services, identify latency hotspots, and correlate traces with logs during production debugging.
Why Distributed Tracing
In a microservice system, a single user request can fan out across dozens of services. When that request is slow, which service is to blame?
Distributed tracing answers this by attaching a shared trace_id to a request and recording a span for every operation it touches.
- A trace = the whole request journey
- A span = one timed unit of work
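To make the trace/span relationship concrete, here is a minimal in-memory sketch in Python. It is not how any particular tracing library works: the `Tracer` class, its `span` method, and the operation names are all hypothetical and exist only for illustration. The sketch assumes a single-threaded request, so a plain stack is enough to track which span is the current parent.

```python
import itertools
import time
from contextlib import contextmanager


class Tracer:
    """Toy tracer (hypothetical): one trace_id shared by many spans."""

    def __init__(self, trace_id):
        self.trace_id = trace_id          # shared by every span of this request
        self.spans = []                   # finished spans, in completion order
        self._open = []                   # span_ids of spans still running
        self._ids = itertools.count(1)    # source of unique span numbers

    @contextmanager
    def span(self, name):
        span_id = f"s{next(self._ids)}"
        # The innermost open span, if any, is the caller of this one.
        parent_id = self._open[-1] if self._open else None
        self._open.append(span_id)
        start_ms = time.monotonic() * 1000
        try:
            yield span_id
        finally:
            end_ms = time.monotonic() * 1000
            self._open.pop()
            self.spans.append({
                "trace_id": self.trace_id,
                "span_id": span_id,
                "parent_id": parent_id,
                "name": name,
                "start_ms": start_ms,
                "end_ms": end_ms,
            })


# One request: a root span with two child operations (names are made up).
tracer = Tracer(trace_id="abc123")
with tracer.span("GET /users"):
    with tracer.span("auth.check"):
        time.sleep(0.01)
    with tracer.span("db.query.users"):
        time.sleep(0.02)

for s in tracer.spans:
    print(s["name"], s["span_id"], "parent:", s["parent_id"])
```

Every recorded span carries the same `trace_id`, which is what lets a tracing backend group them into one request journey. In a real distributed system the `trace_id` and the current span's ID would also have to travel across process boundaries, typically in request headers, which this sketch leaves out.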
Anatomy of a Span
Each span carries timing and context so you can reconstruct the call tree.
- trace_id links all spans of one request
- span_id identifies the operation
- parent_id records who called it
- start/end timestamps give duration
{
  "trace_id": "abc123",
  "span_id": "s2",
  "parent_id": "s1",
  "name": "db.query.users",
  "start_ms": 1042,
  "end_ms": 1310
}
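Given a set of spans shaped like the record above, the call tree and the latency hotspot can be recovered with a few lines of Python. The sketch below is illustrative only: the `s2` span is taken from the example, while the `s1` root span and the `s3` sibling are invented so there is a tree to walk. The self-time calculation assumes a span's children run one after another; with overlapping (parallel) children the subtraction would overcount.

```python
from collections import defaultdict

spans = [
    # Hypothetical root span, invented for this sketch.
    {"trace_id": "abc123", "span_id": "s1", "parent_id": None,
     "name": "GET /users", "start_ms": 1000, "end_ms": 1350},
    # The span from the example above.
    {"trace_id": "abc123", "span_id": "s2", "parent_id": "s1",
     "name": "db.query.users", "start_ms": 1042, "end_ms": 1310},
    # Hypothetical sibling span, invented for this sketch.
    {"trace_id": "abc123", "span_id": "s3", "parent_id": "s1",
     "name": "cache.lookup", "start_ms": 1005, "end_ms": 1030},
]


def duration_ms(span):
    """Wall-clock duration of one span."""
    return span["end_ms"] - span["start_ms"]


def self_times(spans):
    """Duration of each span minus the time spent in its direct children."""
    children = defaultdict(list)
    for s in spans:
        if s["parent_id"] is not None:
            children[s["parent_id"]].append(s)
    return {
        s["span_id"]: duration_ms(s)
        - sum(duration_ms(c) for c in children[s["span_id"]])
        for s in spans
    }


def tree_lines(spans, parent_id=None, depth=0):
    """Render the call tree as indented lines, children ordered by start time."""
    lines = []
    kids = sorted(
        (s for s in spans if s["parent_id"] == parent_id),
        key=lambda s: s["start_ms"],
    )
    for s in kids:
        lines.append(f"{'  ' * depth}{s['name']} ({duration_ms(s)} ms)")
        lines.extend(tree_lines(spans, s["span_id"], depth + 1))
    return lines


self_ms = self_times(spans)
by_id = {s["span_id"]: s for s in spans}
hotspot = by_id[max(self_ms, key=self_ms.get)]

print("\n".join(tree_lines(spans)))
print("hotspot:", hotspot["name"], self_ms[hotspot["span_id"]], "ms self time")
```

The example span lasts 1310 - 1042 = 268 ms. Ranking by self time rather than total duration keeps a parent span from being blamed for time that was actually spent in its children.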