AI Agents · Lesson

Re-ranking with Cross-Encoders

Retrieve the top 50 candidates with cheap embeddings, then re-rank them with a slower cross-encoder and keep the top 5 for higher precision.

Why Re-Rank?

Embedding similarity is a fast, coarse first pass. It often pulls in chunks that share KEYWORDS with the query but are not actually relevant to the QUESTION.

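A re-ranker takes (query, chunk) pairs and scores each pair directly, which gives a more precise relevance score than comparing two independently computed embeddings. Because it is slower, it runs only on the candidates returned by the first pass.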
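The two-stage flow can be sketched in plain Python. This is a minimal illustration, not a production pipeline: `retrieve_then_rerank`, `word_overlap`, and `bigram_overlap` are hypothetical names, and the two scorers are toy stand-ins for embedding similarity and a cross-encoder. The defaults of 50 and 5 mirror the numbers in the lesson summary.

```python
from typing import Callable, List, Tuple

Scorer = Callable[[str, str], float]


def retrieve_then_rerank(
    query: str,
    chunks: List[str],
    cheap_score: Scorer,
    precise_score: Scorer,
    k_retrieve: int = 50,
    k_final: int = 5,
) -> List[Tuple[str, float]]:
    """Two-stage retrieval: cheap first pass, precise second pass."""
    # Stage 1: rank every chunk with the cheap scorer, keep the best k_retrieve.
    candidates = sorted(
        chunks, key=lambda c: cheap_score(query, c), reverse=True
    )[:k_retrieve]
    # Stage 2: run the expensive scorer only on the surviving candidates.
    rescored = [(c, precise_score(query, c)) for c in candidates]
    rescored.sort(key=lambda pair: pair[1], reverse=True)
    return rescored[:k_final]


def word_overlap(query: str, chunk: str) -> float:
    """Toy stand-in for embedding similarity: Jaccard overlap of words."""
    q, c = set(query.lower().split()), set(chunk.lower().split())
    return len(q & c) / len(q | c) if q | c else 0.0


def bigram_overlap(query: str, chunk: str) -> float:
    """Toy stand-in for a cross-encoder: fraction of query bigrams in the chunk."""
    def bigrams(text: str) -> set:
        words = text.lower().split()
        return set(zip(words, words[1:]))

    q, c = bigrams(query), bigrams(chunk)
    return len(q & c) / len(q) if q else 0.0


if __name__ == "__main__":
    docs = [
        "reset password link expires after one hour",
        "password policy and reset schedule for admins",
        "to reset your password open the login page",
    ]
    for chunk, score in retrieve_then_rerank(
        "how to reset your password", docs, word_overlap, bigram_overlap,
        k_retrieve=3, k_final=2,
    ):
        print(f"{score:.2f}  {chunk}")
```

The point of the design is that `precise_score` is called at most `k_retrieve` times per query, no matter how large the corpus is. In a real system the two scorers would be replaced by an embedding model and a cross-encoder model.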

Bi-Encoder vs Cross-Encoder

Two architectures for text similarity:

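Bi-encoder: embeds the query and each chunk separately, then compares them with cosine similarity. Fast, because chunk vectors are computed once and reused across queries, but less accurate.
Cross-encoder: runs each (query, chunk) pair jointly through the model. Slow, because every pair needs its own forward pass and nothing can be precomputed, but much more accurate.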
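The difference in cost comes from the call pattern, which the following sketch tries to make concrete. It uses no real models: `embed` is a toy bag-of-words encoder, and `BiEncoderIndex`, `cross_encoder_scores`, and `pair_model` are hypothetical names chosen for illustration.

```python
import math
from collections import Counter
from typing import Callable, List


def embed(text: str) -> Counter:
    """Toy stand-in for a bi-encoder: a sparse bag-of-words vector."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class BiEncoderIndex:
    """Chunk vectors are computed once at build time and reused per query."""

    def __init__(self, chunks: List[str]) -> None:
        self.chunks = list(chunks)
        self.vectors = [embed(c) for c in self.chunks]  # one-off cost

    def score(self, query: str) -> List[float]:
        q = embed(query)  # a single encoding per query
        return [cosine(q, v) for v in self.vectors]


def cross_encoder_scores(
    query: str, chunks: List[str], pair_model: Callable[[str, str], float]
) -> List[float]:
    """One model call per (query, chunk) pair; nothing can be cached ahead of time."""
    return [pair_model(query, c) for c in chunks]
```

With a bi-encoder, a new query costs one encoding plus cheap vector comparisons. With a cross-encoder, a new query costs one model call per chunk, which is why it is usually reserved for a short candidate list.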
