Wangzhen-kris

Wangzhen's starred repositories

ChatTTS

A generative speech model for daily dialogue.

Language: Python | License: AGPL-3.0 | 2885900

tts-frontend-dataset

TTS FrontEnd DataSet: Polyphone / Prosody / TextNormalization

Language: Python | License: Apache-2.0 | 7600

megatts2

Unofficial implementation of Megatts2

Language: Python | License: MIT | 24700

WhisperSpeech

An Open Source text-to-speech system built by inverting Whisper.

Language: Jupyter Notebook | License: MIT | 364200

HierSpeechpp

The official implementation of HierSpeech++

Language: Python | License: MIT | 114400

Matcha-TTS

[ICASSP 2024] 🍵 Matcha-TTS: A fast TTS architecture with conditional flow matching

Language: Jupyter Notebook | License: MIT | 53700

StableTTS

Next-generation TTS model using flow-matching and DiT, inspired by Stable Diffusion 3

Language: Python | License: MIT | 29300

MeloTTS

High-quality multilingual text-to-speech library by MyShell.ai. Supports English, Spanish, French, Chinese, Japanese, and Korean.

Language: Python | License: MIT | 418300

fish-speech

Brand new TTS solution

Language: Python | License: NOASSERTION | 696600

Prosody_Prediction

Predict prosody labels for Chinese sentences.

Language: Python | 4000

TTS

🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production

Language: Python | License: MPL-2.0 | 3250600

TTS-TextAnalyzer

TTS Text Analyzer

License: Apache-2.0 | 3200

chinese_speech_pretrain

Chinese speech pretrained models

Language: Shell | 98400

voice-changer

Realtime Voice Changer

Language: Python | License: NOASSERTION | 1560200

StyleTTS2

StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models

Language: Python | License: MIT | 453300

Meta-voicebox

Implementation of Meta-Voicebox: the first generative AI model for speech to generalize across tasks with state-of-the-art performance.

License: MIT | 54700

AcademiCodec

AcademiCodec: An Open Source Audio Codec Model for Academic Research

Language: Python | 54500

naturalspeech2-pytorch

Implementation of Natural Speech 2, Zero-shot Speech and Singing Synthesizer, in Pytorch

Language: Python | License: MIT | 124600

bark

🔊 Text-Prompted Generative Audio Model

Language: Jupyter Notebook | License: MIT | 3411100

PaddleSpeech

Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won NAACL2022 Best Demo Award.

Language: Python | License: Apache-2.0 | 1074500

lora

Using Low-rank adaptation to quickly fine-tune diffusion models.

Language: Jupyter Notebook | License: Apache-2.0 | 686400

lora-svc

Singing voice conversion based on Whisper, with LoRA for singing voice cloning

Language: Python | License: MIT | 61100

KAN-TTS

KAN-TTS is a speech-synthesis training framework. Try the demos posted at https://modelscope.cn/models?page=1&tasks=text-to-speech

Language: Python | License: MIT | 47400

diff-svc

Singing Voice Conversion via diffusion model

Language: Jupyter Notebook | License: AGPL-3.0 | 261000

BigVGAN

Official PyTorch implementation of BigVGAN (ICLR 2023)

Language: Python | License: MIT | 81500

adaptive_voice_conversion

Language: Python | License: Apache-2.0 | 46700

nix-tts

🐤 Nix-TTS: Lightweight and End-to-end Text-to-Speech via Module-wise Distillation

Language: Python | License: MIT | 23000

nonparaSeq2seqVC_code

Implementation code of non-parallel sequence-to-sequence VC

Language: Python | License: MIT | 24700

ppg-vc

PPG-Based Voice Conversion

Language: Python | License: Apache-2.0 | 32100

tensorrtx

Implementation of popular deep learning networks with TensorRT network definition API

Language: C++ | License: MIT | 680000