Entity Extraction for Knowledge Graphs
Named entity recognition, relation extraction, and graph population.
What Is Entity Extraction?
Entity extraction, also called named entity recognition (NER), identifies named things in text, such as people, organizations, locations, and dates. It is typically the first step in building a knowledge graph from unstructured text.
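To make the idea concrete, here is a minimal sketch of what an extracted entity looks like as data: a span of the source text plus a type label. The Entity class and the annotate helper are hypothetical names used only for illustration, and the annotations are written by hand rather than predicted by a model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    text: str   # surface form as it appears in the source text
    label: str  # entity type, e.g. PERSON or ORG
    start: int  # character offset where the span begins
    end: int    # character offset just past the span


def annotate(text, mentions):
    """Locate each (surface form, label) pair in text and record its offsets."""
    entities = []
    for surface, label in mentions:
        start = text.find(surface)
        if start == -1:
            continue  # skip mentions that do not occur in the text
        entities.append(Entity(surface, label, start, start + len(surface)))
    return entities


text = 'Elon Musk founded SpaceX in 2002.'
# Hand-written annotations standing in for a model's predictions (assumption).
entities = annotate(text, [('Elon Musk', 'PERSON'), ('SpaceX', 'ORG'), ('2002', 'DATE')])
for ent in entities:
    print(ent)
```

Keeping character offsets alongside the label lets each entity be traced back to the exact place it was mentioned, which is useful later when attaching provenance to graph nodes.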
spaCy NER Basics
spaCy's en_core_web_sm model recognizes standard entity types: PERSON, ORG, GPE (geopolitical entity), DATE, MONEY, and more.
import spacy
# Load English model (install: python -m spacy download en_core_web_sm)
nlp = spacy.load('en_core_web_sm')
text = 'Elon Musk founded SpaceX in 2002 in Hawthorne, California. Tesla is headquartered in Austin, Texas.'
doc = nlp(text)
for ent in doc.ents:
    print(f'{ent.text:30} {ent.label_:15} {spacy.explain(ent.label_)}')
# Output (abridged; exact entities can vary with the model version):
# Elon Musk PERSON People, including fictional
# SpaceX ORG Companies, agencies...
# 2002 DATE Absolute or relative dates
# Hawthorne, California GPE Countries, cities, states
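Once entities are extracted, a common next step toward a knowledge graph is to collapse repeated mentions into unique nodes. The following is a rough sketch under stated assumptions: the entities_to_nodes function and the node schema (name, label, mentions) are hypothetical, and the SimpleNamespace objects are stand-ins that expose the same .text and .label_ attributes as spaCy entity spans so the example runs without a model.

```python
from types import SimpleNamespace


def entities_to_nodes(ents):
    """Collapse entity mentions into unique node records keyed by (label, name)."""
    nodes = {}
    for ent in ents:
        name = ' '.join(ent.text.split())  # normalize internal whitespace
        key = (ent.label_, name.lower())   # case-insensitive deduplication key
        node = nodes.setdefault(key, {'name': name, 'label': ent.label_, 'mentions': 0})
        node['mentions'] += 1              # count how often the entity appears
    return list(nodes.values())


# Stand-in mentions (assumption); with spaCy, doc.ents could be passed instead.
mentions = [
    SimpleNamespace(text='Elon Musk', label_='PERSON'),
    SimpleNamespace(text='SpaceX', label_='ORG'),
    SimpleNamespace(text='spacex', label_='ORG'),
    SimpleNamespace(text='2002', label_='DATE'),
]
nodes = entities_to_nodes(mentions)
for node in nodes:
    print(node)
```

Keying on the label as well as the name keeps, for example, an ORG and a GPE that share a name as separate nodes. Real pipelines usually need stronger entity resolution than lowercasing, so treat this only as a starting point.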