There are 150 repositories under the speech-recognition topic.
🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
DeepSpeech is an open-source embedded (offline, on-device) speech-to-text engine that can run in real time on devices ranging from a Raspberry Pi 4 to high-power GPU servers.
Port of OpenAI's Whisper model in C/C++
Drench yourself in Deep Learning, Reinforcement Learning, Machine Learning, Computer Vision, and NLP by learning from these exciting lectures!!
State-of-the-Art Deep Learning scripts organized by models - easy to train and deploy with reproducible accuracy and performance on enterprise-grade infrastructure.
Speech recognition module for Python, supporting several engines and APIs, online and offline.
Easy-to-use Speech Toolkit including Self-Supervised Learning models, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won the NAACL 2022 Best Demo Award.
A Deep-Learning-Based Chinese Speech Recognition System
Facebook AI Research's Automatic Speech Recognition Toolkit
A PyTorch-based Speech Toolkit
Offline speech recognition API for Android, iOS, Raspberry Pi and servers with Python, Java, C# and Node
Silero Models: pre-trained speech-to-text, text-to-speech and text-enhancement models made embarrassingly simple
End-to-end Automatic Speech Recognition for Mandarin and English in TensorFlow
Lingvo
Automatic Speech Recognition (ASR), Speaker Verification, Speech Synthesis, Text-to-Speech (TTS), Language Modelling, Singing Voice Synthesis (SVS), Voice Conversion (VC)
pytorch-kaldi is a project for developing state-of-the-art DNN/RNN hybrid speech recognition systems. The DNN part is managed by PyTorch, while feature extraction, label computation, and decoding are performed with the Kaldi toolkit.
Machine Learning Resources, Practice and Research
🎙 Speech recognition using the TensorFlow deep learning framework and sequence-to-sequence neural networks
In-App assistant SDK to build a multimodal conversational UX for websites and web apps (JavaScript, React, Angular, Vue, Ember, Electron)
An attempt at tracking the state of the art and recent results (bibliography) in speech recognition.
🐸STT - The deep learning toolkit for Speech-to-Text. Training and deploying STT models has never been so easy.
🔓 Lip Reading - Cross Audio-Visual Recognition using 3D Architectures
Alias is a teachable “parasite” designed to give users more control over their smart assistants, in terms of both customisation and privacy. Through a simple app, the user can train Alias to react to a custom wake-word or sound; once trained, Alias can take control of the user's home assistant by activating it for them.
Kalliope is a framework that will help you to create your own personal assistant.
Open-Source Large Vocabulary Continuous Speech Recognition Engine