AI Engineering Academy · Lesson

Semantic Search with NumPy

Build an in-memory semantic search system in Python, using NumPy to compute cosine similarity between a query embedding and a collection of document embeddings.

Semantic Search Without a Database

Semantic search finds the most relevant documents for a query based on meaning rather than keyword overlap. The simplest implementation uses NumPy to compute cosine similarity between a query embedding and every document embedding held in memory, with no external database required.

This approach works well for up to tens of thousands of documents and is ideal for prototyping before investing in a vector database.
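As a rough sketch of the computation described above, the snippet below scores one query vector against a small matrix of document vectors. The function name `cosine_scores` and the toy 3-dimensional vectors are illustrative assumptions, not part of the lesson; real embeddings have far more dimensions.

```python
import numpy as np

def cosine_scores(query, corpus):
    """Cosine similarity between one query vector and each row of corpus."""
    query = np.asarray(query, dtype=float)
    corpus = np.asarray(corpus, dtype=float)
    # Dot product of every document row with the query
    dots = corpus @ query
    # Product of the vector lengths; dividing by it leaves only the angle term
    norms = np.linalg.norm(corpus, axis=1) * np.linalg.norm(query)
    return dots / norms

# Hypothetical stand-in vectors, one row per document
toy_corpus = np.array([
    [1.0, 0.0, 0.0],   # same direction as the query
    [0.0, 1.0, 0.0],   # orthogonal to the query
    [1.0, 1.0, 0.0],   # 45 degrees from the query
])
toy_query = np.array([1.0, 0.0, 0.0])

scores = cosine_scores(toy_query, toy_corpus)
print(scores)
```

A score near 1 means the two vectors point in almost the same direction, and a score near 0 means they are close to orthogonal. This sketch assumes no vector has zero length.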

Building the Document Corpus

Start by collecting your documents and generating one embedding per document using the OpenAI API. Store the embeddings as a 2D NumPy array where each row is one document vector. Keep a parallel list of document texts so you can retrieve the original content after finding the best matches.

import numpy as np
from openai import OpenAI

# The client reads the API key from the OPENAI_API_KEY environment variable
client = OpenAI()

# Document texts; index i here matches row i of corpus_embeddings below
documents = [
    'Python is a high-level programming language.',
    'NumPy provides fast numerical computing for Python.',
    'Embeddings represent text as dense vectors.',
    'Cosine similarity measures angle between vectors.',
    'RAG combines retrieval with language generation.'
]

# Embed all documents in one batched request
response = client.embeddings.create(
    model='text-embedding-3-small',
    input=documents
)

# Stack the vectors into a 2D array: one row per document
corpus_embeddings = np.array([item.embedding for item in response.data])
print(f'Corpus shape: {corpus_embeddings.shape}')  # (5, 1536)
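To illustrate why the parallel list of texts matters, here is a minimal sketch of mapping the best-scoring rows back to their documents. The helper `top_k` and the toy 3-dimensional embeddings are hypothetical stand-ins so the example runs without an API call; in practice the query vector would presumably come from the same embedding model as the corpus.

```python
import numpy as np

def top_k(query_embedding, corpus_embeddings, documents, k=2):
    """Return the k (score, text) pairs most similar to the query."""
    # Normalize rows and query so a plain dot product equals cosine similarity
    corpus_unit = corpus_embeddings / np.linalg.norm(
        corpus_embeddings, axis=1, keepdims=True
    )
    query_unit = query_embedding / np.linalg.norm(query_embedding)
    scores = corpus_unit @ query_unit
    # Row indices sorted by score, highest first, truncated to k
    best = np.argsort(scores)[::-1][:k]
    # Row i of the array corresponds to documents[i]
    return [(float(scores[i]), documents[i]) for i in best]

# Hypothetical toy data standing in for real embeddings
toy_documents = ['doc about cats', 'doc about dogs', 'doc about cars']
toy_embeddings = np.array([
    [0.9, 0.1, 0.0],
    [0.7, 0.7, 0.1],
    [0.0, 0.1, 0.9],
])
toy_query = np.array([1.0, 0.2, 0.0])

results = top_k(toy_query, toy_embeddings, toy_documents, k=2)
for score, text in results:
    print(f'{score:.3f}  {text}')
```

Because the array rows and the list share the same order, an index returned by `np.argsort` can be used directly to look up the original text.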
