Mraj96's repositories
Bert-VITS2
VITS2 backbone with multilingual BERT
ConsistencyVC-voive-conversion
Using joint training speaker encoder with consistency loss to achieve cross-lingual voice conversion and expressive voice conversion
DDSP-SVC
Real-time end-to-end singing voice conversion system based on DDSP (Differentiable Digital Signal Processing)
e2-tts-pytorch
Implementation of E2-TTS, "Embarrassingly Easy Fully Non-Autoregressive Zero-Shot TTS", in PyTorch
F5-TTS
Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching"
F5-TTS-with-ForcedAlignment-DurationPredictor
Based on Official code of "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching". This work uses phoneme-level forced alignment to stabilize the generation process.
fish-diffusion
An easy-to-understand TTS / SVS / SVC framework
free-svc
[ICASSP 2025] FreeSVC: Towards Zero-shot Multilingual Singing Voice Conversion
generative-ai-for-beginners
12 Lessons, Get Started Building with Generative AI 🔗 https://microsoft.github.io/generative-ai-for-beginners/
glow-svc
Singing voice conversion based on Glow-TTS
Grad-SVC
Singing Voice Conversion based on Grad-TTS. The core algorithm is diffusion.
hallo
Hallo: Hierarchical Audio-Driven Visual Synthesis for Portrait Image Animation
lora-svc
Singing voice conversion based on Whisper, with LoRA for singing voice cloning
PitchVC
PitchVC: Pitch Conditioned Any-to-Many Voice Conversion
pits
PITS: Variational Pitch Inference for End-to-end Pitch-controllable TTS without External Pitch Predictor
PPG-GradVC
A diffusion-based cross-lingual voice conversion model, developed for my bachelor's thesis
QuickVC-VoiceConversion
QuickVC: Any-to-many Voice Conversion Using Inverse Short-time Fourier Transform for Faster Conversion
Retrieval-based-Voice-Conversion-WebUI
Voice data <= 10 mins can also be used to train a good VC model!
s3prl
Self-Supervised Speech Pre-training and Representation Learning Toolkit
so-vits-svc-2
SoftVC VITS Singing Voice Conversion
so-vits-svc-fork
so-vits-svc fork with realtime support, improved interface and more features.
TTS
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
ultimatevocalremovergui
GUI for a Vocal Remover that uses Deep Neural Networks.
VoiceCraft
Zero-Shot Speech Editing and Text-to-Speech in the Wild