AI Prompt Engineering · Lesson

Logging and Documentation Strategies

Recording prompt versions, inputs, and outputs for reproducible debugging.

Why Prompt Logging Matters

Without logging, prompt failures are invisible until a user reports them. With logging, you can:

Detect regressions the moment they occur
Reproduce any past failure exactly as it happened
Measure improvement over time as prompts evolve
Audit model behavior for compliance or safety

Logging is not optional for production prompt systems: it is the foundation of reliable LLM applications.

The Minimum Viable Log Entry

Every prompt interaction should log these fields at minimum:

timestamp: ISO 8601 UTC
prompt_id: which prompt template was used
model: exact model name and version
temperature: sampling parameter
input: the user message (or a hash of it if it contains PII)
output: the model response
latency_ms: response time
tokens_used: input and output token counts (logged below as input_tokens and output_tokens)
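As a rough illustration of the field list above, the sketch below assembles one entry and swaps the raw input for a SHA-256 digest when the caller flags it as containing PII. The names `sha256_hex`, `build_log_entry`, `contains_pii`, and `input_is_hashed` are hypothetical helpers introduced for this example, not part of any library, and the sample values are made up.

```python
import hashlib
import json
from datetime import datetime, timezone


def sha256_hex(text: str) -> str:
    """Return the SHA-256 hex digest of a UTF-8 string."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def build_log_entry(prompt_id, model, temperature, user_message, output,
                    latency_ms, input_tokens, output_tokens, contains_pii=False):
    """Assemble one log record (hypothetical helper).

    When contains_pii is True, only a hash of the input is stored.
    """
    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),  # ISO 8601, UTC
        "prompt_id": prompt_id,
        "model": model,
        "temperature": temperature,
        "input": sha256_hex(user_message) if contains_pii else user_message,
        "input_is_hashed": contains_pii,  # assumed extra flag so readers know which form is stored
        "output": output,
        "latency_ms": latency_ms,
        "input_tokens": input_tokens,
        "output_tokens": output_tokens,
    }


# Illustrative values only
entry = build_log_entry(
    prompt_id="summarize-v1",
    model="gpt-4o",
    temperature=0.7,
    user_message="Jane Doe, jane@example.com",
    output="(model response)",
    latency_ms=120,
    input_tokens=10,
    output_tokens=5,
    contains_pii=True,
)
print(json.dumps(entry, indent=2))
```

A hash lets you tell whether two logged inputs were identical, but it cannot be turned back into the original message, so reproducing a failure on hashed input requires access to the original text from another source.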

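import json
import time
from datetime import datetime, timezone

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment
LOG_PATH = 'prompt_log.jsonl'  # assumed log location, one JSON object per line

def append_log(entry, path=LOG_PATH):
    # Minimal assumed sink: append each entry as one JSON line.
    # Swap in a database or logging service as needed.
    with open(path, 'a', encoding='utf-8') as f:
        f.write(json.dumps(entry, ensure_ascii=False) + '\n')

def logged_call(prompt_id, system_prompt, user_message, model='gpt-4o', temperature=0.7):
    # perf_counter is monotonic, so the measurement is unaffected by clock adjustments
    start = time.perf_counter()
    resp = client.chat.completions.create(
        model=model,
        messages=[
            {'role': 'system', 'content': system_prompt},
            {'role': 'user', 'content': user_message}
        ],
        temperature=temperature
    )
    latency = int((time.perf_counter() - start) * 1000)
    output = resp.choices[0].message.content
    log_entry = {
        'timestamp': datetime.now(timezone.utc).isoformat(),
        'prompt_id': prompt_id,
        # Model name as reported by the API; it may differ from the requested alias
        'model': resp.model,
        'temperature': temperature,
        'input': user_message,  # store a hash instead if the message contains PII
        'output': output,
        'latency_ms': latency,
        'input_tokens': resp.usage.prompt_tokens,
        'output_tokens': resp.usage.completion_tokens
    }
    append_log(log_entry)
    return output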
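Once entries accumulate, the log can be read back to measure how each prompt behaves over time. The sketch below assumes the entries were stored as JSON Lines (one JSON object per line) and carry the `prompt_id`, `latency_ms`, `input_tokens`, and `output_tokens` fields written by `logged_call`. The helper names `parse_entries` and `summarize_by_prompt`, and the sample lines, are hypothetical.

```python
import json
from collections import defaultdict


def parse_entries(lines):
    """Parse JSON Lines text (any iterable of strings, such as an open file) into dicts."""
    return [json.loads(line) for line in lines if line.strip()]


def summarize_by_prompt(entries):
    """Group entries by prompt_id and report call count, mean latency, and total tokens."""
    groups = defaultdict(list)
    for entry in entries:
        groups[entry["prompt_id"]].append(entry)

    summary = {}
    for prompt_id, rows in groups.items():
        summary[prompt_id] = {
            "calls": len(rows),
            "mean_latency_ms": sum(r["latency_ms"] for r in rows) / len(rows),
            "total_tokens": sum(r["input_tokens"] + r["output_tokens"] for r in rows),
        }
    return summary


# Made-up sample lines standing in for a real log file
sample_lines = [
    '{"prompt_id": "summarize-v1", "latency_ms": 800, "input_tokens": 100, "output_tokens": 50}',
    '{"prompt_id": "summarize-v1", "latency_ms": 1200, "input_tokens": 120, "output_tokens": 30}',
    '{"prompt_id": "summarize-v2", "latency_ms": 600, "input_tokens": 90, "output_tokens": 40}',
]
print(summarize_by_prompt(parse_entries(sample_lines)))
```

With a real log, pass the open file instead of `sample_lines`, for example `with open(path, encoding="utf-8") as f: summarize_by_prompt(parse_entries(f))`. Comparing the summaries for two prompt versions is one simple way to spot a latency or token-cost regression.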
