If a sentence is all uppercase, it gives wrong results.
JaViLuMa opened this issue
Hello. I had a task to detect languages for certain sentences.
Let's say we have this sentence:
ZANIMA ME CENA PREMIUM HIŠIC, BLIZU MORJA, IMAMO TUDI PSA.
This is the output:
But if I convert it to sentence case (Zanima me cena hišic, blizu morja, imamo tudi psa.), the output is MUCH different:
I know this issue is easy to work around, but I don't think this behavior is, or ever was, intended.
Has anyone done anything better than detect(TEXT_with_Capital_Letters.lower())?
I think it will almost never degrade accuracy if we make the string lower-case before feeding it into the algorithm.
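For anyone who wants to apply the lowercasing workaround in one place, here is a minimal sketch. It assumes nothing about the detector beyond "takes a string, returns a language code": the `detect` parameter, the function name `detect_lowercased`, and the `only_if_all_caps` flag are all hypothetical names made up for this example, not part of any library's API.

```python
from typing import Callable


def detect_lowercased(
    text: str,
    detect: Callable[[str], str],
    only_if_all_caps: bool = False,
) -> str:
    """Run `detect` on a lowercased copy of `text`.

    `detect` is a placeholder for whichever detector you use (any callable
    that maps a string to a language code); it is an assumption of this
    sketch, not a specific library function.
    """
    # str.isupper() is True when every cased character is uppercase and
    # the string contains at least one cased character.
    if only_if_all_caps and not text.isupper():
        # Mixed-case input is passed through unchanged.
        return detect(text)
    # Lowercase before detection, as suggested in the comments above.
    return detect(text.lower())
```

You would pass your own detector in, e.g. `detect_lowercased(sentence, detect)`. Setting `only_if_all_caps=True` limits the change to fully uppercase input, in case lowercasing everything turns out to hurt for some language where capitalization carries signal. Whether that ever matters in practice has not been measured here.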