There are 7 repositories under the language-identification topic.
A collection of sample apps to demonstrate how to use Google's ML Kit APIs on Android and iOS
The most accurate natural language detection library for Go, suitable for short text and mixed-language text
The most accurate natural language detection library for Python, suitable for short text and mixed-language text
A Repository for Single- and Multi-modal Speaker Verification, Speaker Recognition and Speaker Diarization
The most accurate natural language detection library for Rust, suitable for short text and mixed-language text
The most accurate natural language detection library for Java and the JVM, suitable for long and short text alike
Simple embedding-based text classifier inspired by fastText, implemented in TensorFlow
Easy-to-use speech toolset. Written in TypeScript. Includes tools for synthesis, recognition, alignment, speech translation, language detection, source separation and more.
Simple multilingual lemmatizer for Python, especially useful for speed and efficiency
Code for the paper Language Identification Using Deep Convolutional Recurrent Neural Networks
Implementation of the paper "Spoken Language Recognition using X-vectors" in PyTorch
⚡️ 80x faster language detection with Fasttext | Split text by language for TTS
GlotLID: Language Identification with Support for More Than 2000 Labels -- EMNLP 2023
TensorFlow-based spoken language identification
End-to-End Dialect Identification using Convolutional Neural Networks
A word-level language identification tool that identifies the language of individual words in code-mixed text, e.g. text that combines words from two languages, such as Hindi written in Roman script mixed with English.
Fast and accurate natural language detection. Detector written in PHP. Nito-ELD, ELD.
Rosette API Client Library for Python
fastlangid, the only language identification package that supports Cantonese (zh-yue), Simplified Chinese (zh-hans), and Traditional Chinese (zh-hant)
Spoken Language Identification on Common Voice and AudioSet using Deep Learning
CodeSwitch is an NLP tool that can be used for language identification, POS tagging, named entity recognition, and sentiment analysis of code-mixed data.
Multi-Language Identification
A GRU/LSTM model implemented in Keras and trained to classify languages using MFCC features
AfroLID, a powerful neural toolkit for African language identification that covers 517 African languages.
Dataiku DSS plugin to detect languages, correct misspellings, and clean text data 🧼
Fast and accurate natural language detection. Detector written in JavaScript. Nito-ELD, ELD.
Detect the languages from short pieces of text
Dataset for programming language identification.
Demo: Elasticsearch Language Identification
Language Identification Toolkit