AI Agents · Lesson

Multi-Vector Retrieval (ColBERT)

Index multiple vectors per document (one per token or per chunk) for fine-grained matching.

One Vector Is Not Enough

Standard RAG embeds each chunk into a single vector. That vector averages everything in the chunk — losing fine-grained signal.

Multi-vector retrieval stores MULTIPLE vectors per document and matches them more precisely.

ColBERT (Khattab & Zaharia, 2020) embeds each TOKEN of the document and each token of the query, then computes max-similarity:

score(q, d) = sum over q_token in q: max over d_token in d: cos(q_token, d_token)