Counting With CountVectorizer

Turn documents into a count matrix.

Meet CountVectorizer

scikit-learn's CountVectorizer learns a vocabulary from your documents and counts how often each token appears in each one, all in just a few lines. 🚀

from sklearn.feature_extraction.text import CountVectorizer

Create the Vectorizer

First, create an instance. With no arguments, it lowercases the text and splits it into tokens of two or more alphanumeric characters, so punctuation and single-character words are dropped.

vectorizer = CountVectorizer()
