axa-group / nlp.js

An NLP library for building bots, with entity extraction, sentiment analysis, automatic language identification, and more

Language guess mistakes English for Catalan

jcalve opened this issue

Describe the bug
The Language.guess() function mistakes a short English sentence for Catalan.

To Reproduce
1 - Run this script:

import { Language } from '@nlpjs/language';

const lang = new Language();
const text = 'What is your name?';
console.log(text, lang.guess(text, ['es', 'en', 'ca']));

Output

What is your name? [
  { alpha3: 'cat', alpha2: 'ca', language: 'Catalan', score: 1 },
  {
    alpha3: 'eng',
    alpha2: 'en',
    language: 'English',
    score: 0.9702093397745571
  },
  {
    alpha3: 'spa',
    alpha2: 'es',
    language: 'Spanish',
    score: 0.7093397745571659
  }
]
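
To make the size of the problem concrete, here is a minimal sketch that computes the gap between the best and second-best guess, using the scores copied from the output above. The helper `topMargin` is hypothetical and not part of @nlpjs/language; it only post-processes the array that `guess()` returned here.

```javascript
// Guess results copied verbatim from the output reported above.
const guesses = [
  { alpha3: 'cat', alpha2: 'ca', language: 'Catalan', score: 1 },
  { alpha3: 'eng', alpha2: 'en', language: 'English', score: 0.9702093397745571 },
  { alpha3: 'spa', alpha2: 'es', language: 'Spanish', score: 0.7093397745571659 },
];

// Hypothetical helper (not a library API): returns the score gap between
// the best and second-best guess. A single result returns its own score,
// and an empty list returns 0.
function topMargin(results) {
  const sorted = [...results].sort((a, b) => b.score - a.score);
  if (sorted.length === 0) return 0;
  if (sorted.length === 1) return sorted[0].score;
  return sorted[0].score - sorted[1].score;
}

const margin = topMargin(guesses);
console.log(margin.toFixed(4)); // prints "0.0298"
```

Catalan beats English by only about 0.03 here, so for a sentence this short the ranking is close to a tie rather than a clear win for Catalan.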

Desktop (please complete the following information):

  • OS: Windows
  • Package version: 4.26.1
  • Node: 16