What Is a Token, Really?
Words, punctuation, and the basic unit of NLP.
Meet the Token
A token is the smallest unit of text your code works with. Before any analysis, you chop a sentence into these tiny pieces. 🧩
Usually a Word
Most of the time, a token is just a word. The sentence "I love cats" becomes three tokens: "I", "love", and "cats".
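Here is a minimal sketch of that idea in Python. It assumes the simplest possible approach, splitting on whitespace with the built-in str.split(); the variable names sentence and tokens are chosen only for illustration.

```python
# The example sentence from the lesson text.
sentence = "I love cats"

# str.split() with no arguments splits on runs of whitespace
# and returns the pieces as a list of strings.
tokens = sentence.split()

print(tokens)       # ['I', 'love', 'cats']
print(len(tokens))  # 3
```

This works here because the sentence has no punctuation. With a sentence like "I love cats." the final token would come out as "cats." with the period still attached, so whitespace splitting is only a starting point.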