There are 204 repositories under the speech-recognition topic.
🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
Port of OpenAI's Whisper model in C/C++
DeepSpeech is an open-source embedded (offline, on-device) speech-to-text engine which can run in real time on devices ranging from a Raspberry Pi 4 to high-power GPU servers.
State-of-the-Art Deep Learning scripts organized by models - easy to train and deploy with reproducible accuracy and performance on enterprise-grade infrastructure.
Drench yourself in Deep Learning, Reinforcement Learning, Machine Learning, Computer Vision, and NLP by learning from these exciting lectures!!
Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won NAACL2022 Best Demo Award.
Faster Whisper transcription with CTranslate2
Speech recognition module for Python, supporting several engines and APIs, online and offline.
A PyTorch-based Speech Toolkit
A Deep-Learning-Based Chinese Speech Recognition System
Facebook AI Research's Automatic Speech Recognition Toolkit
OpenVINO™ is an open-source toolkit for optimizing and deploying AI inference
Silero Models: pre-trained speech-to-text, text-to-speech and text-enhancement models made embarrassingly simple
JAX implementation of OpenAI's Whisper model for up to 70x speed-up on TPU.
Distilled variant of Whisper for speech recognition. 6x faster, 50% smaller, within 1% word error rate.
Machine Learning Resources, Practice and Research
A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models.
Automatic Speech Recognition (ASR), Speaker Verification, Speech Synthesis, Text-to-Speech (TTS), Language Modelling, Singing Voice Synthesis (SVS), Voice Conversion (VC)
End-to-end Automatic Speech Recognition for Mandarin and English in TensorFlow
Lingvo
pytorch-kaldi is a project for developing state-of-the-art DNN/RNN hybrid speech recognition systems. The DNN part is managed by PyTorch, while feature extraction, label computation, and decoding are performed with the Kaldi toolkit.
🎙 Speech recognition using the TensorFlow deep learning framework and sequence-to-sequence neural networks
Actionable AI SDK for iOS to enable text and voice conversations with actions (Swift, Objective-C)
Swift native on-device speech recognition with Whisper for Apple Silicon