Logging and Documentation Strategies
Recording prompt versions, inputs, and outputs for reproducible debugging.
Why Prompt Logging Matters
Without logging, prompt failures are invisible until a user reports them. With logging, you can:
- Detect regressions the moment they occur
- Reproduce any past failure exactly as it happened
- Measure improvement over time as prompts evolve
- Audit model behavior for compliance or safety
Logging is not optional for production prompt systems: it is the foundation of reliable LLM applications.
The Minimum Viable Log Entry
Every prompt interaction should log these fields at minimum:
- timestamp: ISO 8601 UTC
- prompt_id: which prompt template was used
- model: exact model name and version
- temperature: sampling parameter
- input: the user message (or a hash if PII)
- output: the model response
- latency_ms: response time
- tokens_used: input + output tokens
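The input field may hold a hash instead of the raw message when it contains PII. Below is a minimal sketch of that option, assuming a keyed SHA-256 digest is acceptable for your privacy requirements. The names `loggable_input`, `contains_pii`, and `LOG_SALT` are hypothetical and not part of any library.

```python
import hashlib
import hmac

# Hypothetical secret; in practice load it from configuration, not source code.
LOG_SALT = b'replace-with-a-secret-salt'


def loggable_input(user_message, contains_pii, salt=LOG_SALT):
    """Return the raw message, or a keyed SHA-256 digest if it holds PII."""
    if not contains_pii:
        return user_message
    # HMAC keeps the digest stable per message while making it hard to
    # brute-force short inputs without knowing the salt.
    digest = hmac.new(salt, user_message.encode('utf-8'), hashlib.sha256).hexdigest()
    return 'hmac-sha256:' + digest
```

Identical messages produce identical digests, so repeated failures on the same input can still be grouped. The trade-off is that a hashed entry cannot be replayed, because the original text is not stored.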
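import json
import time
from datetime import datetime, timezone

from openai import OpenAI  # assumes the OpenAI Python SDK, v1 or later

client = OpenAI()  # reads OPENAI_API_KEY from the environment

LOG_PATH = 'prompt_log.jsonl'  # assumed location; one JSON object per line


def append_log(entry, path=LOG_PATH):
    # Minimal stand-in for a log sink: append the entry as one JSON line.
    with open(path, 'a', encoding='utf-8') as f:
        f.write(json.dumps(entry, ensure_ascii=False) + '\n')


def logged_call(prompt_id, system_prompt, user_message, model='gpt-4o', temperature=0.7):
    # Time the API call with a monotonic clock.
    start = time.perf_counter()
    resp = client.chat.completions.create(
        model=model,
        messages=[
            {'role': 'system', 'content': system_prompt},
            {'role': 'user', 'content': user_message}
        ],
        temperature=temperature
    )
    latency = int((time.perf_counter() - start) * 1000)
    output = resp.choices[0].message.content

    log_entry = {
        'timestamp': datetime.now(timezone.utc).isoformat(),
        'prompt_id': prompt_id,
        'model': model,
        'temperature': temperature,
        'input': user_message,  # store a hash instead if the message may contain PII
        'output': output,
        'latency_ms': latency,
        'input_tokens': resp.usage.prompt_tokens,
        'output_tokens': resp.usage.completion_tokens,
        'tokens_used': resp.usage.total_tokens
    }
    append_log(log_entry)
    return output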
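Reproducing a past failure from a log entry also needs the exact prompt text that `prompt_id` referred to at the time. The sketch below shows one way to do this, assuming prompts are kept in a versioned registry. `PROMPTS`, `build_replay_request`, and the sample entry are illustrative and not taken from a real system.

```python
# Hypothetical registry: each versioned prompt_id maps to its exact system prompt.
PROMPTS = {
    'summarize_v1': 'Summarize the user message in one sentence.',
    'summarize_v2': 'Summarize the user message in one sentence. Use plain language.',
}


def build_replay_request(entry, prompts=PROMPTS):
    """Rebuild the request arguments recorded in a log entry."""
    # Raises KeyError if that prompt version was not kept in the registry.
    system_prompt = prompts[entry['prompt_id']]
    return {
        'model': entry['model'],
        'messages': [
            {'role': 'system', 'content': system_prompt},
            {'role': 'user', 'content': entry['input']},
        ],
        'temperature': entry['temperature'],
    }


# Made-up log entry, using the minimum fields listed earlier.
entry = {
    'prompt_id': 'summarize_v1',
    'model': 'gpt-4o',
    'temperature': 0.7,
    'input': 'The deploy failed twice before the cache was cleared.',
}
request = build_replay_request(entry)
```

The result can be passed to the same client call used for logging, for example `client.chat.completions.create(**request)`. This reproduces the request. With a nonzero temperature the response may still differ between runs, so the logged output remains the record of what actually happened.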