Diagnosing Unexpected Outputs
Classifying failure modes: wrong answer, wrong format, off-topic, hallucination.
When Prompts Fail
Even carefully crafted prompts can produce wrong results. Diagnosing a failure starts with a taxonomy: a classification of what type of failure occurred. Without classification, debugging is guesswork. The four main failure categories are wrong answer, wrong format, off-topic response, and hallucination.
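One way to make the taxonomy concrete is to record a label for every failed output and count the labels, so the most common failure type is visible at a glance. The sketch below is illustrative only: the `FailureType` enum and the `labeled_failures` log are hypothetical names, and the labels are assumed to come from a human reviewer rather than from any automatic classifier.

```python
from collections import Counter
from enum import Enum


class FailureType(Enum):
    """The four failure categories named in this lesson."""
    WRONG_ANSWER = "wrong answer"
    WRONG_FORMAT = "wrong format"
    OFF_TOPIC = "off-topic response"
    HALLUCINATION = "hallucination"


# Hypothetical review log: (prompt, label assigned by a reviewer).
labeled_failures = [
    ("What year was Python first released?", FailureType.WRONG_ANSWER),
    ("Return the result as JSON.", FailureType.WRONG_FORMAT),
    ("Summarize the attached report.", FailureType.OFF_TOPIC),
    ("What year was Python first released?", FailureType.WRONG_ANSWER),
]

# Tally failures per category to see where debugging effort should go first.
counts = Counter(label for _, label in labeled_failures)
for failure_type, n in counts.most_common():
    print(f"{failure_type.value}: {n}")
```

Even a small tally like this turns "the prompt is failing" into a narrower question, such as "why are most failures wrong answers?"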
Failure Type 1: Wrong Answer
A wrong answer is a factual or logical error: the model responded in the correct format and on the correct topic, but the content is incorrect.
Examples include incorrect dates, statistics, or names, and code that contains a logic bug. This is the hardest failure to detect automatically because the output looks correct on the surface.
- Causes: a training data cutoff, a rare fact, or a multi-step reasoning error
- Detection: compare against ground truth, use human review, or make a verification LLM call
# Example: wrong answer failure
prompt = 'What year was Python first released?'
response = 'Python was first released in 1994.' # Wrong — it was 1991
# Ground truth check
GROUND_TRUTH = '1991'
correct = GROUND_TRUTH in response
print(f'Correct: {correct}')  # False
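A plain substring test like the one above can pass when the response mentions the right value only in passing, for example "It was released in 1994, not 1991." A slightly stricter variant, sketched here under the assumption that the expected answer is a single four-digit year, extracts every year from the response and requires that the only one present is the ground truth. The helper names `extract_years` and `is_correct` are hypothetical, introduced only for this illustration.

```python
import re


def extract_years(text: str) -> list[str]:
    """Return every standalone four-digit year (1000-2999) found in text."""
    return re.findall(r"\b[12]\d{3}\b", text)


def is_correct(response: str, ground_truth: str) -> bool:
    """True only if the response names exactly one distinct year and it matches."""
    years = set(extract_years(response))
    return years == {ground_truth}


GROUND_TRUTH = "1991"
print(is_correct("Python was first released in 1994.", GROUND_TRUTH))  # False
print(is_correct("Python was first released in 1991.", GROUND_TRUTH))  # True
# A plain substring test would accept this response; the stricter check does not.
print(is_correct("It was released in 1994, not 1991.", GROUND_TRUTH))  # False
```

The trade-off is that a stricter check can reject a correct response that happens to mention a second year, so ambiguous cases may still need human review.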