Konstantine Goudz's starred repositories
Retrieval-based-Voice-Conversion-WebUI
Easily train a good VC model with voice data <= 10 mins!
ultimatevocalremovergui
GUI for a Vocal Remover that uses Deep Neural Networks.
PowerInfer
High-speed Large Language Model Serving on PCs with Consumer-grade GPUs
open_flamingo
An open-source framework for training large multimodal models.
audiolm-pytorch
Implementation of AudioLM, a SOTA Language Modeling Approach to Audio Generation out of Google Research, in PyTorch
naturalspeech2-pytorch
Implementation of Natural Speech 2, Zero-shot Speech and Singing Synthesizer, in PyTorch
AVeryComfyNerd
ComfyUI related stuff and things
HierSpeechpp
The official implementation of HierSpeech++
bark-voice-cloning-HuBERT-quantizer
The code for the bark-voicecloning model. Training and inference.
voicebox-pytorch
Implementation of Voicebox, new SOTA Text-to-speech network from MetaAI, in PyTorch
HD-Painter
HD-Painter: High-Resolution and Prompt-Faithful Text-Guided Image Inpainting with Diffusion Models
spear-tts-pytorch
Implementation of Spear-TTS - multi-speaker text-to-speech attention network, in PyTorch
EfficientWord-Net
One-shot learning-based hotword detection.
CoMoSpeech
CoMoSpeech: One-Step Speech and Singing Voice Synthesis via Consistency Model
Bridge-TTS
Official codebase for "Schrodinger Bridges Beat Diffusion Models on Text-to-Speech Synthesis" (https://arxiv.org/abs/2312.03491).
UniCATS-CTX-vec2wav
[AAAI 2024] Code for CTX-vec2wav in UniCATS
VALL-E-X-Trainer-by-CustomData
An open-source implementation of Microsoft's VALL-E X zero-shot TTS model. A demo is available at https://plachtaa.github.io
UniCATS-CTX-txt2vec
[AAAI 2024] CTX-txt2vec, the acoustic model in UniCATS
bark-data-gen
Create training data for training a voice cloner for bark text to speech.
audio-diffusion-pytorch-fork
Audio generation using diffusion models, in PyTorch.
bark-with-voice-clone
🔊 Text-prompted Generative Audio Model - With the ability to clone voices