Introduction
Named Entity Recognition (NER) is a Natural Language Processing (NLP) task that identifies and classifies named entities in a text. Named entities are real-world objects that have been assigned a name. They include people, locations, works of art, organizations, days and dates, among many others.
Named Entity Recognition is often used to extract key information from a text as part of a larger task such as topic identification. It can also be used on its own when the goal is simply to pull the important information out of a text.
In this article, I am going to explain how to perform Named Entity Recognition using spaCy.
Prerequisites
- Python installed
- spaCy installed, along with the "en_core_web_sm" model (python -m spacy download en_core_web_sm)
- Basic knowledge of Python programming
What is spaCy?
spaCy is an open-source Python library for performing various NLP tasks.
Its pretrained pipelines include a component that identifies and classifies named entities.
NER using spaCy
First, let's import the spaCy library:
import spacy
Then load the "en_core_web_sm" model and assign it to a variable named nlp
nlp = spacy.load("en_core_web_sm")
Let's create a sample text from which we will extract named entities:
sample_text = "Over 200 youth from Kisumu County in Kenya, have today gotten a chance to take part in a Golf programme by Safaricom held at Lolwe Grounds."
Then create a spaCy document by passing the sample text into nlp():
doc = nlp(sample_text)
To extract the named entities from the document, we will use its .ents attribute:
print(doc.ents)
Output: (200, Kisumu County, Kenya, today, Safaricom, Lolwe Grounds)
Let's now print each entity together with the category (label) it has been assigned.
for ent in doc.ents:
    print(ent, ent.label_)
Output
200 CARDINAL
Kisumu County GPE
Kenya GPE
today DATE
Safaricom ORG
Lolwe Grounds FAC
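If we want the entities organised by category rather than printed one per line, one option is to group them by label. Below is a minimal sketch of that idea. The helper names group_entities and entities_from_text are my own and not part of spaCy, and RECORDED_ENTITIES simply copies the output shown above so that the grouping step can be run without loading a model. Other versions of the model may return different entities.

```python
from collections import defaultdict

# Entity/label pairs copied from the output shown in this article.
RECORDED_ENTITIES = [
    ("200", "CARDINAL"),
    ("Kisumu County", "GPE"),
    ("Kenya", "GPE"),
    ("today", "DATE"),
    ("Safaricom", "ORG"),
    ("Lolwe Grounds", "FAC"),
]


def group_entities(pairs):
    """Group (entity_text, label) pairs into {label: [entity_text, ...]}."""
    grouped = defaultdict(list)
    for text, label in pairs:
        grouped[label].append(text)
    return dict(grouped)


def entities_from_text(text, model="en_core_web_sm"):
    """Run spaCy NER on `text` and return (entity_text, label) pairs."""
    import spacy  # imported here so group_entities works without spaCy

    nlp = spacy.load(model)
    doc = nlp(text)
    return [(ent.text, ent.label_) for ent in doc.ents]


grouped = group_entities(RECORDED_ENTITIES)
for label, texts in grouped.items():
    print(label, texts)
```

With spaCy and the model installed, group_entities(entities_from_text(sample_text)) would apply the same grouping to live output.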
The explain() function
spaCy provides a function, spacy.explain(), that takes a label/category and returns a short description of it.
Let's try it out with the labels we got:
spacy.explain("CARDINAL")
Output: Numerals that do not fall under another type
spacy.explain("GPE")
Output: Countries, cities, states
spacy.explain("DATE")
Output: Absolute or relative dates or periods
spacy.explain("FAC")
Output: Buildings, airports, highways, bridges, etc.
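Rather than calling explain() once per label, we could build a small legend for every label found in a document. This is only a sketch: describe_labels is a hypothetical helper, and its explain parameter is an assumption added so that a stand-in lookup can be passed in. By default it falls back to spacy.explain, which returns None for labels it does not know.

```python
def describe_labels(labels, explain=None):
    """Map each distinct label to its description, keeping first-seen order."""
    if explain is None:
        import spacy  # imported here so a stand-in lookup can be used instead

        explain = spacy.explain
    legend = {}
    for label in labels:
        if label not in legend:
            legend[label] = explain(label)
    return legend


# Stand-in lookup built from the descriptions shown above (no spaCy needed).
known = {
    "CARDINAL": "Numerals that do not fall under another type",
    "GPE": "Countries, cities, states",
}
legend = describe_labels(["CARDINAL", "GPE", "GPE"], explain=known.get)
for label, description in legend.items():
    print(label, "->", description)
```

With a spaCy document at hand, describe_labels(ent.label_ for ent in doc.ents) would produce the legend for that document.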
Visualizing Named Entities using displaCy
displaCy is spaCy's built-in visualizer for dependency parses and named entities.
With the "ent" style, it highlights the named entities directly in the text.
Let's import displaCy
from spacy import displacy
Then, we will create the visual
displacy.render(doc, style="ent", jupyter=True)
Output: the sample text with each named entity highlighted and tagged with its label.
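The jupyter=True argument only makes sense inside a notebook. Outside one, a possible approach is to ask render() for the HTML markup and write it to a file that can be opened in a browser. In the sketch below, render_entities and save_entity_html are hypothetical helper names, and the file name entities.html is an assumption.

```python
from pathlib import Path


def save_entity_html(markup, path="entities.html"):
    """Write HTML markup to `path` and return the path as a Path object."""
    out = Path(path)
    out.write_text(markup, encoding="utf-8")
    return out


def render_entities(text, model="en_core_web_sm"):
    """Return a standalone HTML page with the entities in `text` highlighted."""
    import spacy  # imported here so save_entity_html works without spaCy
    from spacy import displacy

    nlp = spacy.load(model)
    doc = nlp(text)
    # jupyter=False makes render() return the markup as a string;
    # page=True wraps it in a complete HTML page.
    return displacy.render(doc, style="ent", jupyter=False, page=True)
```

Calling save_entity_html(render_entities(sample_text)) would then write the visual to entities.html in the current directory.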
Conclusion
Named Entity Recognition is one of the methods that can be used to gain insights from a text while carrying out NLP tasks. It has several use cases, such as recommendation systems, efficient search algorithms, and customer support.
In this article, we looked at Named Entity Recognition using spaCy. However, spaCy is not the only library that can be used for NER. Other open-source options include NLTK and Stanford NER.