RAG Evaluation (RAGAS, Recall@K)
Measure faithfulness, answer relevance, context precision, and recall@K so you can tell whether a change actually helps.
You Cannot Improve What You Cannot Measure
RAG has many knobs: chunk size, top-K, re-ranker, prompt, model. Without metrics, every change is guesswork.
Key Metrics for RAG
- Retrieval (recall@K): did the right chunks appear among the top K fetched?
- Faithfulness: does the answer stay grounded in the chunks?
- Answer relevance: does the answer address the question?
- Context precision/recall: precision is the share of fetched chunks that are useful; recall is the share of useful chunks that were fetched
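The retrieval-side metrics above can be sketched in a few lines, assuming you have hand-labelled the relevant chunk IDs for each question. The function names, chunk IDs, and labels below are hypothetical and purely illustrative. This is a plain set-based calculation, not the RAGAS API; RAGAS defines its own versions of these metrics, typically scored with an LLM as judge.

```python
from typing import Sequence, Set


def recall_at_k(retrieved: Sequence[str], relevant: Set[str], k: int) -> float:
    """Share of the relevant chunks that show up in the top-k results."""
    if not relevant:
        return 0.0  # assumption: report 0 when no labels exist
    top_k = set(retrieved[:k])
    return len(top_k & relevant) / len(relevant)


def precision_at_k(retrieved: Sequence[str], relevant: Set[str], k: int) -> float:
    """Share of the top-k results that are actually relevant."""
    top_k = retrieved[:k]
    if not top_k:
        return 0.0
    return sum(1 for chunk_id in top_k if chunk_id in relevant) / len(top_k)


# Hypothetical example: ranked retriever output and hand-labelled relevant chunks.
retrieved = ["c3", "c7", "c1", "c9", "c4"]
relevant = {"c1", "c4", "c8"}

for k in (3, 5):
    r = recall_at_k(retrieved, relevant, k)
    p = precision_at_k(retrieved, relevant, k)
    print(f"k={k}: recall@k={r:.2f}, precision@k={p:.2f}")
```

Averaging these scores over a fixed set of labelled questions gives one number per configuration, so a change to chunk size or top-K can be compared before and after on the same questions.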