There are 28 repositories under the automatic-speech-recognition topic.
Automatic Speech Recognition (ASR), Speaker Verification, Speech Synthesis, Text-to-Speech (TTS), Language Modelling, Singing Voice Synthesis (SVS), Voice Conversion (VC)
End-to-end Automatic Speech Recognition for Mandarin and English in TensorFlow
🐸STT - The deep learning toolkit for Speech-to-Text. Training and deploying STT models has never been so easy.
OpenAI Whisper ASR Webservice API
PORORO: Platform Of neuRal mOdels for natuRal language prOcessing
⚡ TensorFlowASR: Almost State-of-the-art Automatic Speech Recognition in TensorFlow 2. Supports languages that can use characters or subwords
End-to-end ASR/LM implementation with PyTorch
Evaluate your speech-to-text system with similarity measures such as word error rate (WER)
This is a list of features, scripts, blogs and resources for making better use of Kaldi (http://kaldi-asr.org/)
Collection of resources on the applications of Large Language Models (LLMs) in Audio AI.
A project dedicated to bringing CPU/on-device model performance close to that of GPU models, with a real-time factor (RTF) below 0.1 on CPU
HuggingSound: A toolkit for speech-related tasks based on Hugging Face's tools
The dataset of Speech Recognition
🔉 Youtube Videos Transcription with OpenAI's Whisper
End-to-End speech recognition implementation based on TensorFlow (CTC, Attention, and MTL training)
🎧 Automatic Speech Recognition: DeepSpeech & Seq2Seq (TensorFlow)
Deep Learning based Automatic Speech Recognition with attention for the Nvidia Jetson.
Offline Speech Recognition with OpenAI Whisper and TensorFlow Lite for Android
This is the official implementation of our neural-network-based fast diffuse room impulse response generator (FAST-RIR) for generating room impulse responses (RIRs) for a given acoustic environment.
Mongolian speech recognition with PyTorch
Trained models for automatic speech recognition (ASR). A library to quickly build applications that require speech to text conversion.
AI stack for interacting with LLMs, Stable Diffusion, Whisper, xTTS and many other AI models
speechlib is a library that can do speaker diarization, transcription and speaker recognition on an audio file to create transcripts with actual speaker names
Training an n-gram based Language Model using KenLM toolkit for Deep Speech 2
A PyTorch implementation for the ZeroSpeech 2019 challenge.
Transcribe audio and generate SRT and VTT files using Whisper models and wit.ai.
VietASR - Vietnamese Automatic Speech Recognition
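
Several of the projects above mention word error rate (WER) as an evaluation measure. As a rough illustration only, and not the implementation used by any repository listed here, WER can be sketched as the word-level edit distance between a reference transcript and a hypothesis, divided by the number of reference words. The function name `word_error_rate` and the whitespace tokenization are assumptions made for this example.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Illustrative sketch: tokenizes on whitespace and applies no text
    normalization (casing, punctuation), which real toolkits usually do.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr

    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution (sat -> sit) and one deletion (the) over 6 reference words.
    print(word_error_rate("the cat sat on the mat", "the cat sit on mat"))
```

Because insertions count as errors, WER can exceed 1.0 when the hypothesis is much longer than the reference.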