Reading the Document-Term Matrix
Understand rows, columns, and sparsity.
What Is the DTM?
A document-term matrix (DTM) is a grid of counts. Each row is a document, each column is a word from your vocabulary, and each cell holds the number of times that word appears in that document. 🧮
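To make the grid concrete, here is a minimal sketch that builds one by hand with plain Python. The three-sentence corpus and the whitespace tokenizer are assumptions made up for illustration; a real vectorizer would handle punctuation and tokenization with more care.

```python
from collections import Counter

# Hypothetical toy corpus, invented for this example.
docs = [
    "the cat sat",
    "the dog sat on the mat",
    "the cat chased the dog",
]

# Simplified tokenization: lowercase, then split on whitespace.
tokenized = [doc.lower().split() for doc in docs]

# Vocabulary: every distinct word, sorted so the column order is stable.
vocab = sorted({word for tokens in tokenized for word in tokens})

# One row per document, one column per vocabulary word.
dtm = []
for tokens in tokenized:
    counts = Counter(tokens)  # missing words count as 0
    dtm.append([counts[word] for word in vocab])

print(vocab)
for row in dtm:
    print(row)
# ['cat', 'chased', 'dog', 'mat', 'on', 'sat', 'the']
# [1, 0, 0, 0, 0, 1, 1]
# [0, 0, 1, 1, 1, 1, 2]
# [1, 1, 1, 0, 0, 0, 2]
```

Even in this tiny grid, 9 of the 21 cells are zero, because each document uses only part of the vocabulary.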
Rows Are Documents
One row holds the full count vector for a single document, summarizing how many times each word appeared in it.
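As a rough illustration of reading a single row, the sketch below pairs each count with its column's word. The vocabulary and the row values are hypothetical, chosen to represent the sentence "the dog sat on the mat".

```python
# Hypothetical vocabulary (column labels) and one DTM row, invented for this example.
vocab = ["cat", "chased", "dog", "mat", "on", "sat", "the"]
row = [0, 0, 1, 1, 1, 1, 2]

# Pair each count with its word and keep only the words that actually appear.
present = {word: count for word, count in zip(vocab, row) if count > 0}

# The row sum is the number of in-vocabulary tokens in the document.
total_tokens = sum(row)

# Zeros mark vocabulary words this document never used.
zero_cells = row.count(0)

print(present)       # {'dog': 1, 'mat': 1, 'on': 1, 'sat': 1, 'the': 2}
print(total_tokens)  # 6
print(zero_cells)    # 2
```

Word order is gone: the row records how often each word occurred, not where it occurred.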