ZaDarkSide / cld3-asm

WebAssembly-based JavaScript bindings for Google's Compact Language Detector v3

cld3-asm

cld3-asm is an isomorphic JavaScript binding to Google's Compact Language Detector v3 (cld3), based on a WebAssembly build of cld3. The module aims to provide a thin, lightweight interface to cld3 without requiring native modules.

Install

npm install cld3-asm

Usage

Loading module asynchronously

cld3-asm relies on the wasm binary of cld3, which needs to be initialized before use.

import { loadModule } from 'cld3-asm';

const cldFactory = await loadModule();

loadModule loads the wasm binary, initializes it, and returns a factory for creating instances of the cld3 language identifier.

loadModule({ timeout?: number }): Promise<CldFactory>

The optional timeout sets how long to wait for the wasm binary to compile and load.
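Because the wasm binary has to be initialized before use, an application will usually want to load it only once and share the result. The following is a minimal sketch of that idea, not part of cld3-asm: createCachedLoader is a hypothetical helper that wraps any promise-returning loader (such as a call to loadModule) and reuses the same promise for every caller.

```typescript
// Hypothetical helper, not provided by cld3-asm.
// Wraps a promise-returning loader so the underlying load runs at most once
// while it is pending or after it has succeeded.
function createCachedLoader<T>(load: () => Promise<T>): () => Promise<T> {
  let pending: Promise<T> | undefined;

  return () => {
    if (pending === undefined) {
      pending = load().catch((err) => {
        // Forget a failed attempt so that a later call can retry the load.
        pending = undefined;
        throw err;
      });
    }
    return pending;
  };
}
```

With cld3-asm this could be used as createCachedLoader(() => loadModule({ timeout: 3000 })), where the value 3000 is an arbitrary placeholder; this document does not state the unit of timeout.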

Creating language identifier

create(minBytes?: number, maxBytes?: number): LanguageIdentifier

LanguageIdentifier exposes a minimal interface to cld3's NNetLanguageIdentifier.

  • findLanguage(text: string): Readonly<LanguageResult> : Finds the most likely language for the given text.
  • findMostFrequentLanguages(text: string, numLangs: number): Array<Readonly<LanguageResult>> : Splits the input text into spans based on script, predicts a language for each span, and returns an array of the top numLangs most frequent languages.
  • dispose(): void : Destroys the current instance of the language identifier. Note that a created instance is not destroyed automatically, so dispose must be called explicitly.
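Since instances are not destroyed automatically, it can help to pair every create with a dispose in one place. The sketch below is a hypothetical helper, not part of cld3-asm. It assumes create is a method of the factory returned by loadModule, declares local interfaces that mirror the signatures listed above, and leaves the result type generic because the fields of LanguageResult are not described here.

```typescript
// Local shapes mirroring the signatures listed above (assumed, not imported).
interface Identifier<R> {
  findLanguage(text: string): Readonly<R>;
  findMostFrequentLanguages(text: string, numLangs: number): Array<Readonly<R>>;
  dispose(): void;
}

interface Factory<R> {
  create(minBytes?: number, maxBytes?: number): Identifier<R>;
}

// Hypothetical helper: creates an identifier, passes it to `use`, and always
// disposes it afterwards, even when `use` throws.
function withIdentifier<R, T>(
  factory: Factory<R>,
  use: (identifier: Identifier<R>) => T
): T {
  const identifier = factory.create();
  try {
    return use(identifier);
  } finally {
    identifier.dispose();
  }
}
```

A call such as withIdentifier(cldFactory, (id) => id.findLanguage(text)) would then return the detection result while guaranteeing that the instance is released.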

Simple examples are available for each environment. In each example directory, run npm install && npm start.

Building / Testing

A few npm scripts are provided for building and testing the code.

  • build: Transpiles the code to ES5 CommonJS in dist.
  • test: Runs both the cld and cld3-asm tests. Does not require a build beforehand.
  • test:cld: Runs integration tests against the actual cld3 wasm binary, using cld's test cases.
  • test:cld3-asm: Runs unit tests against the cld3-asm interface.
  • lint: Runs lint over the whole codebase.
  • lint:staged: Runs lint only on staged changes. This is executed automatically by the precommit hook.
  • commit: Runs a commit wizard for writing the commit message.

License

MIT License


Languages

TypeScript 96.9%, JavaScript 3.1%