Tokens, Spans, and Doc Objects
Navigate spaCy's data structures.
Three Core Objects
spaCy gives you three building blocks: the Doc, the Token, and the Span. Learn these and the whole library opens up.
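As a rough sketch of how the three objects relate, indexing a Doc gives a Token and slicing it gives a Span. The example assumes a blank English pipeline (`spacy.blank("en")`, tokenizer only) so that it runs without downloading a trained model. The variable names are illustrative.

```python
import spacy
from spacy.tokens import Doc, Span, Token

# Assumption: a blank English pipeline (tokenizer only), so no model download is needed.
nlp = spacy.blank("en")
doc = nlp("Maria leads the data team.")

first_token = doc[0]   # indexing a Doc returns a Token
first_two = doc[0:2]   # slicing a Doc returns a Span

print(type(doc).__name__, type(first_token).__name__, type(first_two).__name__)
# prints: Doc Token Span
print(first_two.text)
# prints: Maria leads
```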
The Doc Container
A Doc is the container for a fully processed text. It holds the original text, the sequence of tokens, and every annotation the pipeline produced for them.
doc = nlp("Maria leads the data team.")
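The `doc = nlp(...)` call assumes an `nlp` pipeline object already exists. A minimal runnable version might look like the following. It uses a blank English pipeline as a stand-in, which is an assumption made here to avoid a model download. A blank pipeline only tokenizes, so annotations such as part-of-speech tags stay empty until a trained pipeline (for example `en_core_web_sm`) is loaded with `spacy.load`.

```python
import spacy

# Assumption: blank English pipeline; swap in spacy.load("en_core_web_sm")
# if that trained model is installed and you want full annotations.
nlp = spacy.blank("en")

text = "Maria leads the data team."
doc = nlp(text)

# The Doc keeps the original text intact.
print(doc.text == text)
# prints: True

# len() counts tokens, and iterating a Doc yields Token objects.
token_texts = [token.text for token in doc]
print(len(doc), token_texts)
# prints: 6 ['Maria', 'leads', 'the', 'data', 'team', '.']
```

Note that the final period is its own token, which is why the count is six and not five.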