
HyDE: Hypothetical Document Embeddings

Generate a fake 'ideal' answer with the LLM, embed it, and search with that vector. This often retrieves better than embedding the short query directly.

The Problem with Query Embeddings

User queries are usually short and look nothing like the documents you want to retrieve.

  • Query: "fix my keyboard"
  • Doc: "If your mechanical keyboard's keys are sticking, you can clean the switches with isopropyl alcohol..."

The query and the doc are about the same problem, but their vectors can land far apart in embedding space, so nearest-neighbor retrieval may miss the doc.

HyDE Idea

HyDE = Hypothetical Document Embeddings (Gao et al. 2022).

  1. Ask the LLM to write a fake "ideal answer" to the query
  2. Embed the fake answer (not the query)
  3. Use that vector to search the real corpus
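
The three steps above can be sketched in a few lines. This is a minimal illustration, not a production setup: `embed` is a toy bag-of-words stand-in for a real embedding model (so it captures word overlap rather than meaning), `fake_llm` is a hypothetical stub that returns a canned answer where a real LLM call would go, and `CORPUS`, `search`, and `hyde_search` are names invented for this example.

```python
import math
import re
from collections import Counter
from typing import Callable, List


def embed(text: str) -> Counter:
    """Toy embedding (assumption): word counts instead of a dense vector."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(vector: Counter, corpus: List[str], k: int = 1) -> List[str]:
    """Return the k corpus documents most similar to the given vector."""
    ranked = sorted(corpus, key=lambda doc: cosine(vector, embed(doc)), reverse=True)
    return ranked[:k]


def fake_llm(query: str) -> str:
    """Hypothetical stub for the LLM call; a real system would prompt a model."""
    return ("If keys on your keyboard are sticking, clean the switches "
            "with isopropyl alcohol and let them dry.")


def hyde_search(query: str, corpus: List[str],
                llm: Callable[[str], str], k: int = 1) -> List[str]:
    hypothetical = llm(query)          # 1. write a fake "ideal answer"
    vector = embed(hypothetical)       # 2. embed the fake answer, not the query
    return search(vector, corpus, k)   # 3. search the real corpus with it


# Illustrative corpus (made up for this sketch).
CORPUS = [
    "If your mechanical keyboard's keys are sticking, you can clean the "
    "switches with isopropyl alcohol.",
    "Keyboard shortcuts to fix my most common typos in a text editor.",
    "How to descale a coffee machine with citric acid.",
]

query = "fix my keyboard"
baseline_hit = search(embed(query), CORPUS)[0]      # embeds the short query
hyde_hit = hyde_search(query, CORPUS, fake_llm)[0]  # embeds the fake answer
```

In this toy setup the short query matches the keyboard-shortcuts document, while the hypothetical answer matches the cleaning guide, because the fake answer is written in the same style and vocabulary as the target document. With a real embedding model the effect comes from semantic similarity rather than shared words, and the size of the gain will vary with the model and corpus.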
