Query Rewriting and HyDE
Improving retrieval recall.
The Query Is the Weak Link
Retrieval quality is bounded by the query. Raw user queries are often short, ambiguous, full of pronouns, or phrased unlike the documents. Query transformation reshapes the query before retrieval to raise recall and precision.
It is one of the cheapest, highest-leverage upgrades over naive RAG: fix the query and every downstream stage benefits.
def transform_query(raw, history):
    # Resolve references, expand, decompose, or hypothesize
    # BEFORE embedding and retrieving.
    return rewrite(raw, history)
Query Rewriting
The simplest transform: an LLM rewrites the user query into a clearer, self-contained, retrieval-friendly form. It fixes typos, expands abbreviations, and removes conversational noise.
This is especially important in multi-turn chat, where the latest message (for example, "How about the second one?") is unintelligible without the preceding context.
def rewrite_query(raw):
    prompt = (
        'Rewrite the user question into a clear, standalone search '
        'query optimized for document retrieval. Keep key entities.\n'
        'Question: ' + raw
    )
    return llm(prompt, temperature=0).strip()
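The multi-turn case is where a rewrite needs the conversation as well as the latest message. The following is a minimal sketch of one way to do that, not a definitive implementation. The names `build_rewrite_prompt` and `rewrite_with_history`, the `(role, text)` history format, and the `max_turns` cutoff are all assumptions made for illustration. The `llm` argument stands in for whatever function sends a prompt to your model and returns its text.

```python
from typing import Callable, List, Tuple

# Hypothetical type: any function mapping a prompt string to a completion string.
LLM = Callable[[str], str]
Turn = Tuple[str, str]  # assumed format: (role, text)


def build_rewrite_prompt(raw: str, history: List[Turn], max_turns: int = 3) -> str:
    """Build a prompt that asks the model to resolve references from recent turns."""
    # Keep only the most recent turns so the prompt stays short.
    recent = history[-max_turns:] if max_turns > 0 else []
    transcript = "\n".join(f"{role}: {text}" for role, text in recent)
    return (
        "Rewrite the final user question into a clear, standalone search "
        "query optimized for document retrieval. Resolve pronouns and "
        "references using the conversation. Keep key entities.\n"
        f"Conversation:\n{transcript}\n"
        f"Question: {raw}\n"
        "Standalone query:"
    )


def rewrite_with_history(raw: str, history: List[Turn], llm: LLM) -> str:
    """Return a standalone query, falling back to the raw text on empty output."""
    rewritten = llm(build_rewrite_prompt(raw, history)).strip()
    return rewritten or raw


if __name__ == "__main__":
    # Stub model for demonstration only; a real call would go to your LLM client.
    def fake_llm(prompt: str) -> str:
        return "Pro plan pricing and features"

    chat = [
        ("user", "Compare the Basic and Pro plans."),
        ("assistant", "Basic covers one seat; Pro adds team features."),
    ]
    print(rewrite_with_history("How about the second one?", chat, fake_llm))
```

Falling back to the raw query when the model returns an empty string keeps retrieval working if the rewrite step fails. Capping the history at a few turns is a judgment call: more turns help resolve older references but add tokens and noise to the prompt.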