Rishikesh (ऋषिकेश)'s repositories
ViViT-pytorch
Implementation of ViViT: A Video Vision Transformer
iSTFTNet-pytorch
iSTFTNet: Fast and Lightweight Mel-spectrogram Vocoder Incorporating Inverse Short-time Fourier Transform
FastSpeech2
PyTorch Implementation of FastSpeech 2: Fast and High-Quality End-to-End Text to Speech
HiFiplusplus-pytorch
HiFi++: a Unified Framework for Neural Vocoding, Bandwidth Extension and Speech Enhancement
SoundStorm-pytorch
Google's SoundStorm: Efficient Parallel Audio Generation
Avocodo-pytorch
Avocodo: Generative Adversarial Network for Artifact-free Vocoder
Fre-GAN-pytorch
Fre-GAN: Adversarial Frequency-consistent Audio Synthesis
vae_tacotron2
VAE Tacotron 2, an alternative to GST Tacotron
LightSpeech
LightSpeech: Lightweight and Fast Text to Speech with Neural Architecture Search
UnivNet-pytorch
UnivNet: A Neural Vocoder with Multi-Resolution Spectrogram Discriminators for High-Fidelity Waveform Generation
AdaSpeech2
AdaSpeech 2: Adaptive Text to Speech with Untranscribed Data
AudioMAE-pytorch
Unofficial PyTorch implementation of Masked Autoencoders that Listen
gmvae_tacotron
Gaussian Mixture VAE Tacotron
Liveness-Detection
Liveness Detection for human faces
iSTFT-Avocodo-pytorch
Ultrafast GAN-based Vocoder for Text to Speech
Phone-Level-Mixture-Density-Network-for-TTS
Rich Prosody Diversity Modelling with Phone-level Mixture Density Network
Zero-Shot-TTS
Unofficial Implementation of Zero-Shot Text-to-Speech for Text-Based Insertion in Audio Narration
NU-Wave2-pytorch
NU-Wave 2: A General Neural Audio Upsampling Model for Various Sampling Rates [WIP]
Bidirectional-LEM-pytorch
PyTorch Implementation of Bidirectional Long Expressive Memory
ai-audio-startups
Community list of startups working with AI in audio and music technology
Inception-Transformer-pytorch
iFormer: Inception Transformer
AcademiCodec
AcademiCodec: An Open Source Audio Codec Model for Academic Research
PL-BERT
Phoneme-Level BERT for Enhanced Prosody of Text-to-Speech with Grapheme Predictions