saurabhvyas

Saurabh Vyas's starred repositories

Real-Time-Voice-Cloning

Clone a voice in 5 seconds to generate arbitrary speech in real-time

Language: Python | License: NOASSERTION | 55519 943 1101

bert

TensorFlow code and pre-trained models for BERT

Language: Python | License: Apache-2.0 | 39511 997 1144

NLP-progress

Repository to track the progress in Natural Language Processing (NLP), including the datasets and the current state-of-the-art for the most common NLP tasks.

Language: Python | License: MIT | 22943 1263 101

libfacedetection

An open source library for face detection in images. The face detection speed can reach 1000FPS.

Language: C++ | License: NOASSERTION | 12622 532 321

speech_recognition

Speech recognition module for Python, supporting several engines and APIs, online and offline.

Language: Python | License: BSD-3-Clause | 8861 274 632

ffsubsync

Automagically synchronize subtitles with video.

Language: Python | License: MIT | 7342 77 161

DeepPavlov

An open source library for deep learning end-to-end dialog systems and chatbots.

Language: Python | License: Apache-2.0 | 6930 207 643

BERT-pytorch

Google AI 2018 BERT pytorch implementation

Language: Python | License: Apache-2.0 | 6447 124 88

yolact

A simple, fully convolutional model for real-time instance segmentation.

Language: Python | License: MIT | 5171 103 795

waveglow

A Flow-based Generative Network for Speech Synthesis

Language: Python | License: BSD-3-Clause | 2333 76 257

open-speech-corpora

💎 A list of accessible speech corpora for ASR, TTS, and other Speech Technologies

License: MIT | 1357 56 199

kaldi-gstreamer-server

Real-time full-duplex speech recognition server, based on the Kaldi toolkit and the GStreamer framework.

Language: Python | License: BSD-2-Clause | 1087 68 222

espresso

Espresso: A Fast End-to-End Neural Speech Recognition Toolkit

Language: Python | License: NOASSERTION | 943 43 54

mellotron

Mellotron: a multispeaker voice synthesis model based on Tacotron 2 GST that can make a voice emote and sing without emotive or singing training data

Language: Jupyter Notebook | License: BSD-3-Clause | 859 28 96

FloWaveNet

A Pytorch implementation of "FloWaveNet: A Generative Flow for Raw Audio"

Language: Python | License: MIT | 491 41 21

zamia-speech

Open tools and data for cloudless automatic speech recognition

Language: Python | License: LGPL-3.0 | 447 37 90

obamanet

ObamaNet: Photo-realistic lip-sync from audio (Unofficial port)

Language: Python | License: MIT | 238 13 27

kaldi-dnn-ali-gop

Forced alignment and Goodness of Pronunciation (GOP) with DNN support. Based on Kaldi.

Language: C++ | License: NOASSERTION | 228 150

pykaldi2

Yet another speech toolkit based on Kaldi and PyTorch

Language: Python | License: MIT | 174 12 14

speech_separation

Includes some core functions and models for handling speech separation

Language: Python | License: MIT | 155 11 29

KTSpeechCrawler

Automatically constructing corpus for automatic speech recognition from YouTube videos

Language: Python | License: MIT | 154 16 4

audiomate

Python library for handling audio datasets.

Language: Python | License: MIT | 137 11 81

idlak

Official home of the Idlak Speech Synthesis Toolkit

Language: Shell | License: NOASSERTION | 66 11 31

ventib

📈 Ventib records your voice, transcribes it in real time, and performs speech pattern analysis to give you objective statistics about how you speak.

Language: JavaScript | 46 8 1