komodojp / tinyld

Simple and Performant Language detection library for NodeJS

Home Page: https://komodojp.github.io/tinyld/

Improve detection for long text/documents

kefniark opened this issue

Description

  • Split long text into smaller chunks on punctuation: . , ; : ( )
  • Only process a maximum number of chunks
  • Merge the results before picking the best guess

With 1.1.0:

  • Text is automatically split into multiple chunks
  • For long documents, only a few chunks are evaluated, not the whole text
  • The results are merged
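
For readers who want to see the split / limit / merge idea in code, here is a minimal sketch. It is not tinyld's actual implementation: the names `Scores`, `Scorer`, `splitIntoChunks`, `pickChunks`, `mergeScores` and `detectLongText` are all hypothetical, the scorer is passed in as a stand-in for whatever per-chunk detector is used, and both the even sampling of chunks and the summing of scores are assumptions about how "only a few chunks" and "merged" could be done.

```typescript
// Hypothetical shape of a per-chunk result: language code -> score.
type Scores = Record<string, number>;

// Hypothetical scorer: stands in for whatever detector runs on each chunk.
type Scorer = (chunk: string) => Scores;

// Split on the punctuation listed in the issue: . , ; : ( )
function splitIntoChunks(text: string): string[] {
  return text
    .split(/[.,;:()]+/)
    .map((chunk) => chunk.trim())
    .filter((chunk) => chunk.length > 0);
}

// Keep at most `max` chunks, spread evenly across the document
// (assumption: the issue does not say which chunks are kept).
function pickChunks(chunks: string[], max: number): string[] {
  if (max <= 0) return [];
  if (chunks.length <= max) return chunks;
  const step = chunks.length / max;
  return Array.from({ length: max }, (_, i) => chunks[Math.floor(i * step)]);
}

// Merge per-chunk results by summing the score of each language
// (assumption: summing is one simple way to merge).
function mergeScores(results: Scores[]): Scores {
  const merged: Scores = {};
  for (const result of results) {
    for (const [lang, score] of Object.entries(result)) {
      merged[lang] = (merged[lang] ?? 0) + score;
    }
  }
  return merged;
}

// Split, sample, score, merge, then pick the best guess.
// Returns an empty string when there is nothing to score.
function detectLongText(text: string, scorer: Scorer, maxChunks = 5): string {
  const chunks = pickChunks(splitIntoChunks(text), maxChunks);
  const merged = mergeScores(chunks.map(scorer));
  let best = "";
  let bestScore = -Infinity;
  for (const [lang, score] of Object.entries(merged)) {
    if (score > bestScore) {
      best = lang;
      bestScore = score;
    }
  }
  return best;
}
```

Because only `maxChunks` chunks are scored, the cost no longer grows with document length, and a single misleading chunk (a quoted foreign phrase, for example) is outweighed by the others when the scores are merged.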