There are 9 repositories under the language-identification topic.
A collection of sample apps to demonstrate how to use Google's ML Kit APIs on Android and iOS
A Repository for Single- and Multi-modal Speaker Verification, Speaker Recognition and Speaker Diarization
The most accurate natural language detection library for Python, suitable for short text and mixed-language text
The most accurate natural language detection library for Go, suitable for short text and mixed-language text
The most accurate natural language detection library for Rust, suitable for short text and mixed-language text
The most accurate natural language detection library for Java and the JVM, suitable for long and short text alike
Cross-platform speech toolset, used from the command-line or as a Node.js library. Includes a variety of engines for speech synthesis, speech recognition, forced alignment, speech translation, voice isolation, language detection and more.
Simple embedding-based text classifier inspired by fastText, implemented in TensorFlow
⚡️ 80x faster fastText language detection out of the box | Split text by language
Simple multilingual lemmatizer for Python, especially useful for speed and efficiency
Code for the paper "Language Identification Using Deep Convolutional Recurrent Neural Networks"
Implementation of the paper "Spoken Language Recognition using X-vectors" in PyTorch
A TensorFlow-based spoken language identification system
Fast and accurate natural language detection. Detector written in JavaScript. Nito-ELD, ELD.
A word-level language identification tool for identifying the language of individual words in code-mixed text, e.g. text that mixes Hindi written in Roman script with English.
End to End Dialect Identification using Convolutional Neural Network
✨ Split text by languages (e.g. 你喜欢看アニメ吗 -> 你喜欢看 | アニメ | 吗) for NLP tasks (e.g. parsing, TTS). Powered by fastText and budoux
Fast and accurate natural language detection. Detector written in PHP. Nito-ELD, ELD.
fastlangid, the only language identification package that supports Cantonese (zh-yue), Simplified Chinese (zh-hans) and Traditional Chinese (zh-hant)
Babel Street Analytics Client Library for Python
Spoken Language Identification on Common Voice and AudioSet using Deep Learning
CodeSwitch is an NLP tool that can be used for language identification, POS tagging, named entity recognition, and sentiment analysis of code-mixed data.
AfroLID, a powerful neural toolkit for African language identification, which covers 517 African languages.
Multi-Language Identification
Implement a GRU/LSTM model using Keras, and train it to classify the languages using MFCC features
Dataiku DSS plugin to detect languages, correct misspellings, and clean text data 🧼
Detect the languages from short pieces of text
Dataset for programming language identification.
Demo: Elasticsearch Language Identification