Vishay Raina's repositories
Bag-of-Visual-Words
Uses a Bag-of-Visual-Words (BoVW) model to classify images into four object categories: airplanes, bikes, cars, and faces.
arabic_pronounce
Pronounce Arabic words
asr_labs
ASR labs
Best-README-Template
An awesome README template to jumpstart your projects!
camel_tools
A suite of Arabic natural language processing tools developed by the CAMeL Lab at New York University Abu Dhabi.
ctcdecode
PyTorch CTC Decoder bindings
da-lang-id
Domain Adaptation for Spoken Language ID
demo
Example code to remind myself, especially of the API
Digit-Recognition
A LeNet CNN model that classifies images of the digits 0-9.
E2E-ASR
PyTorch Implementations for End-to-End Automatic Speech Recognition
EEND
End-to-End Neural Diarization
kaldi
This is the official location of the Kaldi project.
marytts-lexicon-de
German lexicon for MaryTTS
neural_sp
End-to-end ASR/LM implementation with PyTorch
pika
A lightweight speech processing toolkit based on PyTorch and (Py)Kaldi
speech-training-recorder
Simple GUI application to help record audio dictated from given text prompts, for use with training speech recognition or speech synthesis.
spoteno
Spoken text normalization for ASR
triplet-entropy-loss
Project repository for the work done in Triplet Entropy Loss: Improving The Generalization of Short Speech Language Identification Systems
Tuplemax-Loss
Unofficial implementation of the pairwise tuplemax loss from "Tuplemax Loss for Language Identification" (https://arxiv.org/pdf/1811.12290.pdf), Eq. (2). Works only for batch_size = 1.
UHV-OTS-Speech
A data annotation pipeline to generate high-quality, large-scale speech datasets with machine pre-labeling and fully manual auditing.
VGG-Speaker-Recognition
Utterance-level Aggregation For Speaker Recognition In The Wild
wenet
Production First and Production Ready End-to-End Speech Recognition Toolkit
youtube-dl
Command-line program to download videos from YouTube.com and other video sites