caiselvass / language-identification

An NLP project that uses character trigrams and smoothing techniques (Lidstone, Linear Discounting, Absolute Discounting) for language identification. Trained on Spanish, Italian, English, French, Dutch, and German, it achieves 99.8932% accuracy. Includes datasets, model parameters, and comprehensive documentation.
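
To make the approach concrete, here is a minimal sketch of character-trigram language identification with Lidstone smoothing. It is not the repository's code: the function names (`char_trigrams`, `train`, `lidstone_log_prob`, `identify`), the space padding, the toy sentences, the choice of `lam=0.5`, and the use of the observed trigram vocabulary as a stand-in for the number of possible trigrams are all assumptions made for illustration.

```python
import math
from collections import Counter


def char_trigrams(text):
    """Return the overlapping character trigrams of `text` (lowercased, space-padded)."""
    padded = f"  {text.lower()} "  # padding scheme is an assumption of this sketch
    return [padded[i:i + 3] for i in range(len(padded) - 2)]


def train(samples):
    """Build one trigram Counter per language from {language: [sentences]}."""
    return {
        lang: Counter(t for sentence in sentences for t in char_trigrams(sentence))
        for lang, sentences in samples.items()
    }


def lidstone_log_prob(text, counts, vocab_size, lam=0.5):
    """Log-probability of `text` under Lidstone smoothing:
    P(t) = (count(t) + lam) / (N + lam * B), with N total trigrams and B = vocab_size.
    """
    denom = sum(counts.values()) + lam * vocab_size
    return sum(math.log((counts[t] + lam) / denom) for t in char_trigrams(text))


def identify(text, models, vocab_size, lam=0.5):
    """Return the language whose trigram model gives `text` the highest log-probability."""
    return max(models, key=lambda lang: lidstone_log_prob(text, models[lang], vocab_size, lam))


# Toy training data (hypothetical; the real project trains on full corpora).
samples = {
    "en": ["the quick brown fox jumps over the lazy dog",
           "this is a short english sentence"],
    "es": ["el rápido zorro marrón salta sobre el perro perezoso",
           "esta es una frase corta en español"],
}
models = train(samples)
# Stand-in for B: the number of distinct trigrams seen across all languages.
vocab_size = len(set().union(*models.values()))

print(identify("the dog is here", models, vocab_size))
print(identify("el perro es perezoso", models, vocab_size))
```

Summing log-probabilities instead of multiplying raw probabilities avoids numeric underflow on longer inputs, and the `lam` term keeps unseen trigrams from driving a language's score to zero.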
