Building a Vocabulary
Map every word to a fixed index.
What Is a Vocabulary?
A vocabulary is the full set of unique words your model knows. Each word gets one fixed slot in every document vector. 📖
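To make the "fixed slot" idea concrete, here is a minimal sketch. It assumes a count-based (bag-of-words) document vector, which the text above does not specify. The four-word `vocab` and the `to_vector` helper are hypothetical, made up for illustration.

```python
# Hypothetical four-word vocabulary: each word maps to one fixed slot index.
vocab = {"cat": 0, "mat": 1, "sat": 2, "the": 3}

def to_vector(text):
    """Count how often each vocabulary word appears in text."""
    vector = [0] * len(vocab)          # one slot per vocabulary word
    for word in text.split():
        if word in vocab:              # words outside the vocabulary are skipped
            vector[vocab[word]] += 1
    return vector

print(to_vector("the cat sat"))   # [1, 0, 1, 1]
print(to_vector("the mat the"))   # [0, 1, 0, 2]
```

Both vectors have the same length, and slot 3 always means "the", whichever document is being encoded.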
Collect Every Unique Word
To build it, gather the words from all of your documents and keep one copy of each distinct word.
# set() drops the repeated "the", leaving 4 distinct words
words = set("the cat sat the mat".split())
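A set has no guaranteed order, so it cannot by itself give each word a fixed index. One possible way to get from distinct words to stable indices is sketched below. The `documents` list is a made-up example, and sorting alphabetically is an assumption: any consistent ordering would work.

```python
# Hypothetical two-document corpus, for illustration only.
documents = ["the cat sat", "the cat sat on the mat"]

# Collect the distinct words across all documents.
unique_words = set()
for doc in documents:
    unique_words.update(doc.split())

# Sort before numbering so every run assigns the same index to each word.
vocab = {word: index for index, word in enumerate(sorted(unique_words))}

print(vocab)  # {'cat': 0, 'mat': 1, 'on': 2, 'sat': 3, 'the': 4}
```

Once built, the mapping should stay unchanged so that a given index always refers to the same word.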