shaun's repositories
audio-diffusion
Apply Denoising Diffusion Probabilistic Models using the new Hugging Face diffusers package to synthesize music instead of images.
VITS_gpt_llama
The code for LLaMA in the vitsGPT project
voicefixer_main
General Speech Restoration
Bert-VITS2
VITS2 backbone with BERT
Adan
Adan: Adaptive Nesterov Momentum Algorithm for Faster Optimizing Deep Models
diffsptk
A differentiable version of SPTK
FAcodec
Training code for FAcodec presented in NaturalSpeech3
fregrad
Code repository for FreGrad
GPT-SoVITS
Just 1 minute of voice data is enough to train a good TTS model (few-shot voice cloning)
larynx2_vits_TTS_cpp_onnx
A fast, local neural text to speech system
llama3
The official Meta Llama 3 GitHub site
llm.c
LLM training in simple, raw C/CUDA
metavoice-src
AI for human-level speech intelligence
MiniGemini
Official implementation for Mini-Gemini
open-unmix-pytorch
Open-Unmix - Music Source Separation for PyTorch
phase_augmentation_one_to_many
Submitted to ICASSP 2023
seamless_communication
Foundational Models for State-of-the-Art Speech and Text Translation
sgmse_Speech-Enhancement-and-Dereverberation-with-Diffusion-based-Generative-Models
Score-based Generative Models (Diffusion Models) for Speech Enhancement and Dereverberation
snake
SNAKE Inspired by "Neural Networks Fail to Learn Periodic Functions and How to Fix It"
stable-audio-tools
Generative models for conditional audio generation
stable-speech
Reproduction of Stability AI's Text-to-Speech model.
storm
StoRM: A Diffusion-based Stochastic Regeneration Model for Speech Enhancement and Dereverberation
vector-quantize-pytorch
Vector Quantization, in PyTorch
visqol
Perceptual Quality Estimator for speech and audio
vit-pytorch
Implementation of Vision Transformer, a simple way to achieve SOTA in vision classification with only a single transformer encoder, in PyTorch
vitsgpt-vits
The code for VITS in the vitsGPT project
VoiceCraft
Zero-Shot Speech Editing and Text-to-Speech in the Wild
WavCraft
Official repo for WavCraft, an AI agent for audio creation and editing