komodojp / tinyld

Simple and Performant Language detection library for NodeJS

Home Page: https://komodojp.github.io/tinyld/

Not detecting this Chinese text: 地图箭头方向与实际情况相反

lucasrmendonca opened this issue

Using tinyld version 1.3.4

Steps to reproduce:

import { detect } from 'tinyld';

// Input consists only of Han (Chinese) characters
const detectedLanguage = detect("地图箭头方向与实际情况相反");
console.log(detectedLanguage);

Output:

''

Expected output:

zh

In the Playground (https://komodojp.github.io/tinyld/), only the heavy version recognises it.

Correct me if I'm wrong, but shouldn't the presence of Asian Unicode characters in the string be enough for tinyld to at least guess that it must be one of the Asian languages?

I'm not sure how it works under the hood, but it feels strange that the heavy version is required for this.
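
To illustrate the kind of script-based guess I have in mind, here is a minimal sketch. It is not how tinyld works internally (I don't know that), and `guessCjkLanguage` is a hypothetical helper, not part of the tinyld API. It only assumes that the runtime supports Unicode property escapes in regular expressions. The mapping is a rough heuristic: kana implies Japanese, Hangul implies Korean, and text made up of Han characters alone is assumed to be Chinese, even though it could also be kanji-only Japanese.

```javascript
// Hypothetical fallback, NOT part of tinyld: guess a language from Unicode scripts alone.
const KANA = /[\p{Script=Hiragana}\p{Script=Katakana}]/u;
const HANGUL = /\p{Script=Hangul}/u;
const HAN = /\p{Script=Han}/u;

function guessCjkLanguage(text) {
  // Kana is checked first because Japanese text usually mixes kana with Han characters.
  if (KANA.test(text)) return "ja";
  if (HANGUL.test(text)) return "ko";
  // Assumption: Han characters with no kana or Hangul are treated as Chinese.
  if (HAN.test(text)) return "zh";
  // Same "no result" value that detect() returned above.
  return "";
}

console.log(guessCjkLanguage("地图箭头方向与实际情况相反")); // prints "zh"
```

Something like this could serve as a fallback when `detect` returns an empty string, e.g. `detect(text) || guessCjkLanguage(text)`.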