AI Prompt Engineering · Lesson

Reading and Evaluating AI Outputs

Criteria for judging relevance, accuracy, and completeness of AI responses.

Why Evaluation Matters

Getting an AI response is only half the job. The second half is judging whether that response is actually good.

Without a clear evaluation framework, you might accept a response that looks polished but answers the wrong question, or reject a response that is genuinely useful because it isn't formatted the way you expected.

Developing a reliable eye for AI output quality is one of the most practical skills in prompt engineering.

The Four Evaluation Criteria

When you read an AI response, judge it on four dimensions:

Relevance — Did it answer the actual question you asked?
Accuracy — Is the information factually correct?
Completeness — Did it cover all parts of the request?
Format — Is it structured the way you need it to be?

A response can score high on three criteria and still fail on the fourth, so judge each criterion on its own rather than forming a single overall impression.
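One way to make the four criteria concrete is to record a separate score for each and check them one at a time. The sketch below is illustrative only: the `Evaluation` class, the 1 to 5 scale, and the pass threshold of 3 are assumptions made for this example, not part of the lesson. The scores themselves still come from your own reading of the response.

```python
from dataclasses import dataclass
from typing import List

# The four criteria from this lesson, in the order they are listed.
CRITERIA = ("relevance", "accuracy", "completeness", "format")


@dataclass
class Evaluation:
    """Hypothetical rubric: one score per criterion, on an assumed 1-5 scale."""
    relevance: int
    accuracy: int
    completeness: int
    format: int

    def failing(self, threshold: int = 3) -> List[str]:
        # Check every criterion separately; the threshold of 3 is an assumption.
        return [name for name in CRITERIA if getattr(self, name) < threshold]

    def acceptable(self, threshold: int = 3) -> bool:
        # Acceptable only if no single criterion falls below the threshold.
        return not self.failing(threshold)

    def average(self) -> float:
        # Shown only for contrast: an average can hide a failing criterion.
        return sum(getattr(self, name) for name in CRITERIA) / len(CRITERIA)


# A response that is on topic, correct, and thorough, but in the wrong structure.
polished_but_misformatted = Evaluation(relevance=5, accuracy=5, completeness=5, format=1)

print(polished_but_misformatted.average())     # 4.0
print(polished_but_misformatted.failing())     # ['format']
print(polished_but_misformatted.acceptable())  # False
```

The average of 4.0 looks healthy, yet the response still fails on format. Checking each score against the threshold, instead of combining them, is what keeps that failure visible.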
