AI Engineering Academy · Lesson

The Code Execution Loop

Design the write-execute-observe loop where the agent generates code, a sandboxed executor runs it, stdout and stderr are captured and fed back as observations, and the agent fixes errors.

Code Agents: Writing and Running Code

A code execution agent is an AI agent that solves problems by writing code, running it, observing the output, and iterating until the task is complete. Unlike agents that only use pre-defined tools, code agents create new computational tools on the fly. This makes them highly flexible: any task that can be expressed as a program can be attempted by a code agent.

The Write-Execute-Observe Loop

The core of a code execution agent is a loop with three steps:

  1. Write: the LLM generates Python code to solve the current step of the task.
  2. Execute: the code is run in a sandboxed environment, and stdout and stderr are captured.
  3. Observe: the execution output is fed back to the LLM as a new observation, which it uses to decide what to write next.

This loop continues until the task is complete or an iteration limit is reached.

def code_execution_loop(task: str, max_iterations: int = 10) -> str:
    # llm, extract_code_block, and execute_safely are helpers assumed to be
    # defined elsewhere: an LLM client, a code-block parser, and a sandboxed runner.
    messages = [
        {'role': 'system', 'content': 'You are a Python coding agent. Write code to solve tasks step by step.'},
        {'role': 'user', 'content': task}
    ]

    for _ in range(max_iterations):
        # WRITE: LLM generates code
        response = llm.complete(messages)
        code = extract_code_block(response)

        if not code:
            return response  # LLM gave a final answer without code

        # EXECUTE: run the code and capture stdout and stderr
        output, error = execute_safely(code)

        # OBSERVE: feed both streams back, so output printed before a failure is not lost
        observation = f'Output:\n{output}'
        if error:
            observation += f'\nError:\n{error}'
        messages.append({'role': 'assistant', 'content': response})
        messages.append({'role': 'user', 'content': observation})

    return 'Max iterations reached'
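
The loop above relies on two helpers that the lesson does not define: extract_code_block and execute_safely. The following is a minimal sketch of what they might look like, under two assumptions: the LLM wraps its code in a triple-backtick fence, and a child Python process with a timeout is an acceptable stand-in for a real sandbox. It is not a real sandbox. A subprocess has the same file and network access as the parent process; proper isolation is the subject of the next lesson.

```python
import re
import subprocess
import sys
from typing import Optional, Tuple

# Built programmatically so this example does not contain a literal fence.
FENCE = "`" * 3

# Assumption: the LLM marks code with a fence and an optional "python"/"py" tag.
CODE_BLOCK = re.compile(FENCE + r"(?:python|py)?[ \t]*\n(.*?)" + FENCE, re.DOTALL)


def extract_code_block(response: str) -> Optional[str]:
    """Return the first fenced code block in the response, or None if there is none."""
    match = CODE_BLOCK.search(response)
    return match.group(1).strip() if match else None


def execute_safely(code: str, timeout: float = 5.0) -> Tuple[str, str]:
    """Run code in a separate Python process and return (stdout, stderr).

    Illustrative only: this limits run time, not what the code can access.
    """
    try:
        result = subprocess.run(
            [sys.executable, "-I", "-c", code],  # -I: isolated mode, ignores user site and env vars
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return "", f"Timed out after {timeout} seconds"
    return result.stdout, result.stderr
```

Because execute_safely returns stdout and stderr separately, the caller can decide how to present them to the LLM. A script that prints partial results and then raises an exception produces text in both streams, and both are useful for debugging.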

All lessons in this course

  1. The Code Execution Loop
  2. Sandboxing with Docker and RestrictedPython
  3. State Management Across Execution Steps
  4. Building a Data Analysis Agent