
Identifying Slow and Expensive Steps

Waterfall profiling: where is the agent spending its time and budget?

Performance Profiling for Agents

Agent performance issues fall into two categories: slow steps (high latency) and expensive steps (high token cost). Slow steps degrade the user experience, and expensive steps drive up operating costs. The first step in fixing either is measurement.
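To make the two categories concrete, here is a minimal sketch that ranks the same set of steps by latency and by token cost. The step names, the numbers, and the per-token price are all hypothetical placeholders for illustration, not measurements or real provider rates.

```python
from dataclasses import dataclass

# Hypothetical price for illustration only; substitute your provider's real rate.
PRICE_PER_1K_TOKENS_USD = 0.002


@dataclass
class StepRecord:
    name: str
    duration_ms: float
    tokens: int

    @property
    def cost_usd(self) -> float:
        # Cost scales linearly with tokens under this simplified pricing assumption.
        return self.tokens / 1000 * PRICE_PER_1K_TOKENS_USD


# Illustrative numbers, not real measurements.
records = [
    StepRecord("plan", 200.0, 400),
    StepRecord("web_search", 1500.0, 0),
    StepRecord("summarize", 800.0, 2500),
]

slowest = max(records, key=lambda r: r.duration_ms)
priciest = max(records, key=lambda r: r.cost_usd)

print(f"Slowest step: {slowest.name} ({slowest.duration_ms:.0f} ms)")
print(f"Most expensive step: {priciest.name} (${priciest.cost_usd:.4f})")
```

In this made-up data the slowest step and the most expensive step are different, which is why it can help to track both metrics per step rather than only one.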

Timing Each Step

Use time.perf_counter() for high-precision timing. It reads a monotonic, high-resolution clock and counts time spent sleeping or waiting on I/O, which is exactly what matters for agent step latency.

import time
from contextlib import contextmanager

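@contextmanager
def timer(step_name: str, timings: dict):
    """Time the enclosed block and store its duration in ms under step_name."""
    start = time.perf_counter()
    try:
        yield
    finally:
        # Runs even if the step raises, so failed steps are still timed.
        end = time.perf_counter()
        duration_ms = (end - start) * 1000
        # A repeated step_name overwrites the earlier entry.
        timings[step_name] = duration_ms
        print(f'{step_name}: {duration_ms:.1f}ms')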

# Usage
timings = {}

with timer('entity_extraction', timings):
    time.sleep(0.05)  # Simulate work

with timer('vector_search', timings):
    time.sleep(0.12)  # Simulate work

with timer('llm_call', timings):
    time.sleep(0.80)  # Simulate LLM latency

print('\nTimings:', timings)
print('Slowest step:', max(timings, key=timings.get))
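
The timings above give durations but not when each step started, which a waterfall view needs. The following is a rough sketch of one way to capture start offsets and draw a text waterfall. The helper names span and render_waterfall are hypothetical, chosen for this example, and the sleeps stand in for real work.

```python
import time
from contextlib import contextmanager


@contextmanager
def span(name: str, spans: list, origin: float):
    """Record (name, start offset in ms, duration in ms) relative to origin."""
    start = time.perf_counter()
    try:
        yield
    finally:
        end = time.perf_counter()
        spans.append((name, (start - origin) * 1000, (end - start) * 1000))


def render_waterfall(spans: list, width: int = 40) -> str:
    """Draw one text bar per step, indented by the step's start offset."""
    total_ms = max(start_ms + dur_ms for _, start_ms, dur_ms in spans)
    lines = []
    for name, start_ms, dur_ms in spans:
        pad = int(start_ms / total_ms * width)
        bar = max(1, int(dur_ms / total_ms * width))  # always show at least one mark
        share = dur_ms / total_ms * 100
        lines.append(f"{name:<18} {' ' * pad}{'#' * bar} {dur_ms:.0f}ms ({share:.0f}%)")
    return "\n".join(lines)


spans = []
origin = time.perf_counter()  # time zero for the whole agent run

with span("entity_extraction", spans, origin):
    time.sleep(0.01)  # Simulate work
with span("vector_search", spans, origin):
    time.sleep(0.02)  # Simulate work
with span("llm_call", spans, origin):
    time.sleep(0.05)  # Simulate LLM latency

print(render_waterfall(spans))
```

Each bar starts where the previous step ended, so gaps between bars (time not covered by any span) and the step that dominates the run are both visible at a glance.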
