Multi-Vector Retrieval (ColBERT)
Index multiple vectors per document (one per token or per chunk) for fine-grained matching.
One Vector Is Not Enough
Standard RAG embeds each chunk into a single vector. That vector compresses everything in the chunk into one point, so fine-grained signal such as a rare term or a specific phrase gets diluted.
Multi-vector retrieval stores MULTIPLE vectors per document and compares the query against each of them, so a strong match on one part of the document is not washed out by the rest.
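A toy illustration of the dilution effect, as a minimal sketch: the three hand-picked vectors below are hypothetical stand-ins for the parts of one chunk, and mean pooling is assumed as the way a single-vector encoder summarizes them (real encoders differ in detail).

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical chunk: three parts, each pointing in its own direction.
chunk_vectors = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]

# The query matches the first part of the chunk exactly.
query = [1.0, 0.0, 0.0]

# Single-vector view (assumption: mean pooling over the parts).
single_vector = [sum(col) / len(chunk_vectors) for col in zip(*chunk_vectors)]
single_score = cosine(query, single_vector)

# Multi-vector view: keep every part and take the best match.
multi_score = max(cosine(query, v) for v in chunk_vectors)

print(f"single-vector score: {single_score:.3f}")  # 0.577
print(f"multi-vector score:  {multi_score:.3f}")   # 1.000
```

The exact match on one part scores 1.0 when the parts are kept separate, but only about 0.577 once the parts are averaged into one vector.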
ColBERT Idea
ColBERT (Khattab & Zaharia, 2020) embeds each TOKEN of the document and each token of the query, then computes max-similarity:
score(q, d) = sum over q_token in q: max over d_token in d: cos(q_token, d_token)
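The max-similarity score can be sketched in a few lines of plain Python. This is an illustration of the formula only: the function name maxsim_score and the 2-D token vectors are hypothetical, chosen by hand rather than produced by a real ColBERT encoder.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def maxsim_score(query_vectors, doc_vectors):
    """For each query token take its best-matching document token, then sum."""
    return sum(
        max(cosine(q, d) for d in doc_vectors)
        for q in query_vectors
    )

# Hypothetical token embeddings (one vector per token).
query_tokens = [[1.0, 0.0], [0.0, 1.0]]

# doc_a has a good match for both query tokens.
doc_a = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
# doc_b matches only the first query token.
doc_b = [[1.0, 0.0], [-1.0, 0.0]]

score_a = maxsim_score(query_tokens, doc_a)
score_b = maxsim_score(query_tokens, doc_b)

print(f"doc_a: {score_a:.1f}")  # 2.0
print(f"doc_b: {score_b:.1f}")  # 1.0
```

Each query token contributes at most 1.0, so doc_a, which covers both query tokens, outranks doc_b, which covers only one. Extra document tokens that match nothing do not lower the score, because only the best match per query token counts.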