There are 3 repositories under the asr-model topic.
An implementation of RNN-Transducer loss in TF-2.0.
Fine-tune Wav2vec2, an ASR model released by Facebook
Summarization, topic generation using GPT3
QuartzNet implementation in PyTorch [https://arxiv.org/abs/1910.10261]
Collection and resources for Bulgarian Corpus, Datasets and Models used in ASR, TTS or NLP tasks together with the links of corresponding tools/apps.
Many ASRs under one roof, with benchmarking, answering the question: what is the best ASR for my dataset?
DeepSpeech ASR Model for the Catalan Language
Automatic speech recognition (ASR) for Indonesian language built by using HTK and Julius. Web interface is built using Node.js.
Audio Preprocessing and finetuning of wav2vec2-large-xlsr model on AI4D Baamtu Datamation - Automatic Speech Recognition in WOLOF Data.
Automatic Speech Recognition (ASR) model QuartzNet trained on English CommonVoice. In PyTorch with CTC loss and beam search.
An end-to-end ASR Transformer model training repo
A benchmark of speech recognition solutions for the Catalan language
Audio to Audio (Whisper+ChatGPT+Bark)
Implementation of the paper "Listen, Attend and Spell" in PyTorch
Create and adapt n-gram and JSGF language models, e.g. for Kaldi-ASR nnet3 chain models from Zamia-Speech
QuartzNet implementation for Automatic Speech Recognition task
Build an end-to-end deep neural network to translate speech to text (ASR model)
The dataset of Sichuan dialect conversational speech
Mega Conversational Speech Datasets for Speech Recognition
ASR Web App: a Chinese speech recognition lab app built with Django, including Chinese speech-to-text and Chinese voice chatbot modules.
Notebook for fine-tuning Whisper on the Mozilla Common Voice dataset.
A simple CRDNN-based ASR model for my own understanding of how ASR models work and are trained. (Work in progress.) If anyone finds an error or has a suggestion, please let me know.
End-to-End Automatic Speech Recognition in PyTorch with CTC Decoder and KenLM
American English Conversational Speech Dataset
Chinese Wake-up Words Speech Dataset
Hindi Speech Dataset
Thai Speech Dataset
Mixed Speech with Korean and English Dataset
Indonesian Speech Dataset
The dataset of Henan Dialect conversational speech
How I used Seamless M4T large to reach the top 5 of the Mozilla Common Voice competition hosted on Zindi
SpeechKit Asynchronous Batch Recognizer.