What Are Vector Embeddings?
Understand how neural networks map text to dense vectors in high-dimensional space, why semantically similar texts produce nearby vectors, and what cosine similarity measures.
The Meaning Behind Numbers
A vector embedding is a list of numbers (a vector) that represents the meaning of a piece of text. Instead of storing words as raw strings, neural networks learn to encode semantic meaning into dense numerical arrays. Two texts with similar meaning will produce vectors that are close together in this high-dimensional space.
High-Dimensional Vector Space
Modern embeddings typically have 768 to 3072 dimensions. Each dimension captures some latent feature of meaning — topics, sentiment, style, and relationships — without being explicitly programmed. The result is a rich geometric space where semantic relationships become measurable distances.
For example, text-embedding-3-small produces 1536-dimensional vectors, while text-embedding-3-large produces 3072-dimensional vectors.
All lessons in this course
- What Are Vector Embeddings?
- Generating Embeddings with OpenAI
- Semantic Search with NumPy
- Clustering and Visualizing Embeddings