To split any text into sentences using winkNLP, read the text using readDoc. Then use the sentences method to get a collection of sentences from the text. Follow this with the out method to get this collection as a JavaScript array. This is how you can split a text into sentences:
// Load wink-nlp package & helpers.
const winkNLP = require( 'wink-nlp' );
// Load "its" helper to extract item properties.
const its = require( 'wink-nlp/src/its.js' );
// Load the English language model (light version).
const model = require( 'wink-eng-lite-model' );
// Instantiate winkNLP.
const nlp = winkNLP( model );
// Input text
const text = 'AI Inc. is focussing on AI. It is based in the U.S.A. It was started on 06.12.2007.';
// Read text
const doc = nlp.readDoc( text );
// Extract sentences from the data
const sentences = doc.sentences().out();
console.log( sentences );
This returns an array of sentences:
[
'AI Inc. is focussing on AI.',
'It is based in the U.S.A.',
'It was started on 06.12.2007.'
]
If no sentence break is found in the input text, the output is an array whose single member is the complete text.
A sentence is usually split at a full stop, question mark, or exclamation mark. Even in the presence of abbreviations, honorifics, etc., winkNLP attempts to identify the sentence boundary intelligently.
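To see why abbreviation handling matters, here is a rough sketch of a naive splitter that breaks after every full stop, question mark, or exclamation mark followed by whitespace. This is not how winkNLP works internally; naiveSplit is a hypothetical helper written only for comparison, and it uses plain JavaScript with no packages.

```javascript
// Naive baseline for comparison only. This is NOT winkNLP's algorithm.
// It splits on whitespace that directly follows '.', '?' or '!'.
function naiveSplit( text ) {
  return text.split( /(?<=[.?!])\s+/ );
}

// Same input text as in the winkNLP example.
const sample = 'AI Inc. is focussing on AI. It is based in the U.S.A. It was started on 06.12.2007.';

const naiveSentences = naiveSplit( sample );
console.log( naiveSentences );
// [
//   'AI Inc.',
//   'is focussing on AI.',
//   'It is based in the U.S.A.',
//   'It was started on 06.12.2007.'
// ]
```

The naive version wrongly breaks after the abbreviation "Inc." and returns four pieces, whereas the winkNLP output shown earlier keeps 'AI Inc. is focussing on AI.' together as one sentence.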