Reranking Retrieved Results
Boost RAG accuracy by reranking initial vector search candidates with cross-encoder models before passing context to the LLM.
The Reranking Idea
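Vector search is fast but approximate: it compares precomputed embeddings, so the highest-scoring results are not always the most relevant ones. Reranking takes the top candidates from that search and reorders them with a slower, more accurate model. This is the two-stage retrieve-then-rerank pattern.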
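The two stages can be sketched in a few lines of Python. This is a minimal illustration, not a production pipeline: `embed` is a toy bag-of-words stand-in for an embedding model, and `toy_pair_scorer` is a hypothetical stand-in for a cross-encoder. Only the shape of the pipeline (cheap search over everything, expensive scoring over a shortlist) is the point.

```python
from collections import Counter
from math import sqrt
from typing import Callable, List


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(query: str, docs: List[str], k: int) -> List[str]:
    """Stage 1: cheap similarity search over the whole corpus, keep the top k."""
    q = embed(query)
    return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]


def rerank(query: str, candidates: List[str],
           score_pair: Callable[[str, str], float], top_n: int) -> List[str]:
    """Stage 2: score every (query, candidate) pair with the slower model."""
    scored = [(score_pair(query, d), d) for d in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)  # stable for ties
    return [d for _, d in scored[:top_n]]


def toy_pair_scorer(query: str, doc: str) -> float:
    """Hypothetical cross-encoder stand-in: fraction of query terms the doc covers."""
    q_terms = set(query.lower().split())
    d_terms = set(doc.lower().split())
    return len(q_terms & d_terms) / len(q_terms) if q_terms else 0.0


docs = [
    "password password password strength",   # keyword-stuffed, fools stage 1
    "how to reset a forgotten password",
    "reset the router to factory settings",
    "cooking pasta at home",
]
query = "reset password"

candidates = retrieve(query, docs, k=3)                       # wide, cheap pass
final = rerank(query, candidates, toy_pair_scorer, top_n=2)   # narrow, careful pass
```

In this toy data the keyword-stuffed document wins the first stage, and the pair scorer moves the document that covers both query terms to the top. In a real system, the values of `k` and `top_n` are tuning choices that trade latency against recall.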
Bi-Encoder vs Cross-Encoder
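Bi-encoders embed the query and each document separately, so document embeddings can be computed once and stored (fast, used for retrieval). Cross-encoders score the query and document together in a single pass, which lets the model compare the two texts directly (slower, typically more accurate). Because that cost is paid for every query-document pair, cross-encoders are best suited to reranking a small candidate set.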
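The cost difference comes from the two interfaces, which the sketch below makes concrete by counting model calls. Both classes are toy stand-ins with hypothetical names (`ToyBiEncoder`, `ToyCrossEncoder`), not real library APIs, and their scoring logic is deliberately trivial. What matters is that `encode` sees one text at a time, while `score` needs the query and the document together.

```python
from typing import Dict, List

Vector = Dict[str, float]


class ToyBiEncoder:
    """Toy stand-in: encodes one text at a time, so doc vectors can be cached."""

    def __init__(self) -> None:
        self.model_calls = 0

    def encode(self, text: str) -> Vector:
        self.model_calls += 1
        tokens = text.lower().split()
        return {t: tokens.count(t) / len(tokens) for t in set(tokens)}


class ToyCrossEncoder:
    """Toy stand-in: needs query and document together, so nothing can be cached."""

    def __init__(self) -> None:
        self.model_calls = 0

    def score(self, query: str, doc: str) -> float:
        self.model_calls += 1
        q_tokens = query.lower().split()
        d_tokens = set(doc.lower().split())
        return sum(t in d_tokens for t in q_tokens) / len(q_tokens)


def dot(a: Vector, b: Vector) -> float:
    """Cheap vector comparison; no model call involved."""
    return sum(w * b.get(t, 0.0) for t, w in a.items())


corpus: List[str] = [
    "how to reset a forgotten password",
    "reset the router to factory settings",
    "password strength guidelines",
    "cooking pasta at home",
]
queries: List[str] = ["reset password", "router settings", "pasta recipe"]

# Bi-encoder: documents are encoded once, ahead of time.
bi = ToyBiEncoder()
doc_vectors = [bi.encode(d) for d in corpus]
index_calls = bi.model_calls

# At query time: one model call per query, then cheap dot products.
for q in queries:
    q_vec = bi.encode(q)
    similarities = [dot(q_vec, v) for v in doc_vectors]
bi_query_calls = bi.model_calls - index_calls

# Cross-encoder: one model call per (query, document) pair, every time.
cross = ToyCrossEncoder()
for q in queries:
    pair_scores = [cross.score(q, d) for d in corpus]
cross_query_calls = cross.model_calls
```

With 4 documents and 3 queries, the bi-encoder makes 3 query-time model calls and the cross-encoder makes 12. The gap grows with corpus size, which is why the cross-encoder is applied only to the shortlist returned by the first stage.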