cythc's starred repositories
so-vits-svc
SoftVC VITS Singing Voice Conversion
Bert-VITS2
VITS2 backbone with multilingual BERT
EmotiVoice
EmotiVoice 😊: a Multi-Voice and Prompt-Controlled TTS Engine
soundstorm-pytorch
Implementation of SoundStorm, Efficient Parallel Audio Generation from Google DeepMind, in PyTorch
naturalspeech2-pytorch
Implementation of Natural Speech 2, Zero-shot Speech and Singing Synthesizer, in PyTorch
HierSpeechpp
The official implementation of HierSpeech++
chinese_speech_pretrain
Chinese speech pretrained models
Meta-voicebox
Implementation of Meta-Voicebox: the first generative AI model for speech to generalize across tasks with state-of-the-art performance.
string2string
String-to-String Algorithms for Natural Language Processing
XPhoneBERT
XPhoneBERT: A Pre-trained Multilingual Model for Phoneme Representations for Text-to-Speech (INTERSPEECH 2023)
SyntaSpeech
SyntaSpeech: Syntax-aware Generative Adversarial Text-to-Speech; IJCAI 2022; Official code
whisper-vits-japanese
VITS Japanese with Whisper as the data processor (you can train your VITS model even if you only have audio)
AuxiliaryASR
Joint CTC-S2S Phoneme-level ASR for Voice Conversion and TTS (Text-Mel Alignment)
TransferTTS
TransferTTS (Zero-Shot learning of VITS)
SpeechTasks
A list of speech tasks and datasets that can provide training data for generative AI, AIGC, AI model training, intelligent speech tool development, and speech applications.
silk-codec
SILK codec: encode audio to SILK; decode SILK to PCM