There are 39 repositories under the automatic-speech-recognition topic.
Automatic Speech Recognition (ASR), Speaker Verification, Speech Synthesis, Text-to-Speech (TTS), Language Modelling, Singing Voice Synthesis (SVS), Voice Conversion (VC)
OpenAI Whisper ASR Webservice API
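A hedged sketch of calling such a Whisper ASR webservice over HTTP. The /asr endpoint, the audio_file form field, and port 9000 reflect this webservice's interface as I understand it; treat them as assumptions and check the repository's README for the exact parameters.

```python
# Sketch: POST an audio file to a locally running Whisper ASR webservice.
# Endpoint path, form field name, and port are assumptions, not verified.
import requests

with open("speech.wav", "rb") as f:
    resp = requests.post(
        "http://localhost:9000/asr",                    # assumed default port
        params={"task": "transcribe", "output": "json"},
        files={"audio_file": f},
    )
resp.raise_for_status()
print(resp.json())
```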
End-to-end Automatic Speech Recognition for Mandarin and English in TensorFlow
🐸STT - The deep learning toolkit for Speech-to-Text. Training and deploying STT models has never been so easy.
Open-source industrial-grade ASR models supporting Mandarin, Chinese dialects and English, achieving a new SOTA on public Mandarin ASR benchmarks, while also offering outstanding singing lyrics recognition capability.
Voice Activity Detector (VAD): low-latency, high-performance, and lightweight
PORORO: Platform Of neuRal mOdels for natuRal language prOcessing
:zap: TensorFlowASR: Almost state-of-the-art Automatic Speech Recognition in TensorFlow 2. Supports languages that use characters or subwords
Frontier CoreML audio models in your apps — text-to-speech, speech-to-text, voice activity detection, and speaker diarization. In Swift, powered by SOTA open source.
Evaluate your speech-to-text system with similarity measures such as word error rate (WER)
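For context, word error rate is the word-level edit distance between a reference transcript and a hypothesis, normalized by the reference length. A minimal, self-contained sketch (the function name is illustrative, not this repository's API):

```python
# Minimal WER sketch: Levenshtein distance over word sequences,
# normalized by the number of reference words.
def word_error_rate(reference: str, hypothesis: str) -> float:
    ref, hyp = reference.split(), hypothesis.split()
    # dp[i][j] = edit distance between ref[:i] and hyp[:j]
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i
    for j in range(len(hyp) + 1):
        dp[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,          # deletion
                           dp[i][j - 1] + 1,          # insertion
                           dp[i - 1][j - 1] + cost)   # substitution
    return dp[len(ref)][len(hyp)] / max(len(ref), 1)

print(word_error_rate("the cat sat on the mat", "the cat sat on mat"))  # ~0.167
```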
Collection of resources on the applications of Large Language Models (LLMs) in Audio AI.
End-to-end ASR/LM implementation with PyTorch
Offline Speech Recognition with OpenAI Whisper and TensorFlow Lite for Android
A list of features, scripts, blogs, and resources for making better use of Kaldi (http://kaldi-asr.org/)
A project dedicated to making CPU/on-device models approach GPU-model performance, with a real-time factor (RTF) below 0.1 on CPU
HuggingSound: A toolkit for speech-related tasks based on Hugging Face's tools
A collection of datasets for speech recognition
🔉 Youtube Videos Transcription with OpenAI's Whisper
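A minimal sketch of the workflow such a tool follows: pull the audio track of a video, then run Whisper on it. yt-dlp and the video URL are my assumptions for the download step; the repository itself may use a different downloader.

```python
# Fetch a video's audio track, then transcribe it with openai-whisper.
# Requires ffmpeg on PATH for the mp3 extraction step.
import yt_dlp
import whisper

url = "https://www.youtube.com/watch?v=EXAMPLE"  # hypothetical video URL

ydl_opts = {
    "format": "bestaudio/best",
    "outtmpl": "audio.%(ext)s",
    "postprocessors": [{"key": "FFmpegExtractAudio", "preferredcodec": "mp3"}],
}
with yt_dlp.YoutubeDL(ydl_opts) as ydl:
    ydl.download([url])

# Smaller Whisper models are faster; larger ones are more accurate.
model = whisper.load_model("base")
result = model.transcribe("audio.mp3")
print(result["text"])
```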
[LREC-COLING 2024 (Oral), Interspeech 2024 (Oral), NAACL 2025, ACL 2025] A Series of Multilingual Multitask Medical Speech Processing
End-to-end speech recognition implementation based on TensorFlow (CTC, attention, and MTL training)
speechlib is a library that can do speaker diarization, transcription and speaker recognition on an audio file to create transcripts with actual speaker names.
🎧 Automatic Speech Recognition: DeepSpeech & Seq2Seq (TensorFlow)
Deep Learning based Automatic Speech Recognition with attention for the Nvidia Jetson.
This is the official implementation of our neural-network-based fast diffuse room impulse response generator (FAST-RIR) for generating room impulse responses (RIRs) for a given acoustic environment.
Thonburian Whisper: open fine-tuned Whisper models for Thai. Try our demo on Hugging Face Spaces.
AI stack for interacting with LLMs, Stable Diffusion, Whisper, xTTS and many other AI models
VietASR - Vietnamese Automatic Speech Recognition