✨
Developer friendly Natural Language Processing
WinkNLP is a JavaScript library for Natural Language Processing (NLP). Designed specifically to make development of NLP applications easier and faster, winkNLP is optimized for the right balance of performance and accuracy.
It is built from the ground up with a lean code base that has no external dependency. A test coverage of ~100% and compliance with the Open Source Security Foundation best practices make winkNLP the ideal tool for building production grade systems with confidence.
With full TypeScript support, winkNLP runs on Node.js and in browsers.
Build amazing apps quickly
Head to live examples to explore further.
Blazing fast
WinkNLP can easily process large amounts of raw text at speeds over 650,000 tokens/second on an M1 MacBook Pro in both browser and Node.js environments. It even runs smoothly on a low-end smartphone’s browser.
Features
WinkNLP has a comprehensive natural language processing (NLP) pipeline covering tokenization, sentence boundary detection (sbd), negation handling, sentiment analysis, part-of-speech (pos) tagging, named entity recognition (ner) and custom entities recognition (cer). It offers a rich feature set:

- Multilingual tokenization: for example, the text string "¡Hola! नमस्कार! Hi! Bonjour chéri" is tokenized as ["¡", "Hola", "!", "नमस्कार", "!", "Hi", "!", "Bonjour", "chéri"]. The tokenizer processes text at a speed close to 4 million tokens/second in an M1 MBP’s browser.
- Process any text using a simple, declarative syntax; most live examples have 30–40 lines of code.
- Programmatically mark tokens, sentences, entities, etc. using HTML mark or any other tag of your choice.
- Remove and/or retain tokens with specific attributes such as part-of-speech, named entity type, token type, stop word, shape and many more; compute the Flesch reading ease score; generate n-grams; normalize, lemmatise or stem. Check out how, with the right kind of text preprocessing, even a Naive Bayes classifier achieves impressive (≥90%) accuracy in sentiment analysis and chatbot intent classification tasks.
- Compact model sizes starting from under 3 MB, drastically reducing model loading time.
- BM25 vectorizer; several similarity methods – Cosine, Tversky, Sørensen-Dice, Otsuka-Ochiai; helpers to get bag of words, frequency table, lemma/stem, stop word removal and many more.
WinkJS also has packages like Naive Bayes classifier, multi-class averaged perceptron and popular token and string distance methods, which complement winkNLP.
Documentation
- Concepts — everything you need to know to get started.
- API Reference — explains usage of APIs with examples.
- Change log — version history along with the details of breaking changes, if any.
- Examples — live examples with code to give you a head start.
Installation
Use npm install:
npm install wink-nlp --save
To use winkNLP after installation, you also need to install a language model according to the Node.js version used. The table below outlines the version-specific installation command:
| Node.js version | Installation |
| --- | --- |
| 16 or 18 | npm install wink-eng-lite-web-model --save |
| 14 or 12 | node -e "require('wink-nlp/models/install')" |
The wink-eng-lite-web-model is designed to work with Node.js version 16 or 18. It can also work on browsers as described in the next section. This is the recommended model.
The second command installs the wink-eng-lite-model, which works with Node.js version 14 or 12.
How to install for Web Browser
If you’re using winkNLP in the browser, use the wink-eng-lite-web-model. Learn about its installation and usage in our guide to using winkNLP in the browser. Explore winkNLP recipes on Observable for live browser-based examples.
Get started
Here is the “Hello World!” of winkNLP:
```javascript
// Load wink-nlp package.
const winkNLP = require( 'wink-nlp' );
// Load english language model.
const model = require( 'wink-eng-lite-web-model' );
// Instantiate winkNLP.
const nlp = winkNLP( model );
// Obtain "its" helper to extract item properties.
const its = nlp.its;
// Obtain "as" reducer helper to reduce a collection.
const as = nlp.as;

// NLP Code.
const text = 'Hello World🌎! How are you?';
const doc = nlp.readDoc( text );

console.log( doc.out() );
// -> Hello World🌎! How are you?

console.log( doc.sentences().out() );
// -> [ 'Hello World🌎!', 'How are you?' ]

console.log( doc.entities().out( its.detail ) );
// -> [ { value: '🌎', type: 'EMOJI' } ]

console.log( doc.tokens().out() );
// -> [ 'Hello', 'World', '🌎', '!', 'How', 'are', 'you', '?' ]

console.log( doc.tokens().out( its.type, as.freqTable ) );
// -> [ [ 'word', 5 ], [ 'punctuation', 2 ], [ 'emoji', 1 ] ]
```
Experiment with winkNLP on RunKit.
Speed & Accuracy
WinkNLP processes raw text at ~650,000 tokens per second with its wink-eng-lite-web-model, when benchmarked using “Ch 13 of Ulysses by James Joyce” on an M1 MacBook Pro machine with 16GB RAM.