sunnnnnnnny's repositories

TTS

🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production

Language:PythonLicense:MPL-2.0Stargazers:1Issues:0Issues:0

Amphion

Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.

Language:PythonLicense:MITStargazers:0Issues:0Issues:0
Stargazers:0Issues:1Issues:0

AuxiliaryASR

Joint CTC-S2S Phoneme-level ASR for Voice Conversion and TTS (Text-Mel Alignment)

Language:PythonLicense:MITStargazers:0Issues:0Issues:0

BBDown

Bilibili Downloader. 一款命令行式哔哩哔哩下载器.

Language:C#License:MITStargazers:0Issues:0Issues:0
Language:PythonLicense:MITStargazers:0Issues:1Issues:0

hifi-gan

HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis

Language:PythonLicense:MITStargazers:0Issues:0Issues:0

ChatTTS

ChatTTS is a generative speech model for daily dialogue.

Language:Jupyter NotebookLicense:NOASSERTIONStargazers:0Issues:0Issues:0

DALLE-pytorch

Implementation / replication of DALL-E, OpenAI's Text to Image Transformer, in Pytorch

Language:PythonLicense:MITStargazers:0Issues:0Issues:0

emotion2vec

Official PyTorch code for extracting features and training downstream models with emotion2vec: Self-Supervised Pre-Training for Speech Emotion Representation

Language:PythonStargazers:0Issues:0Issues:0

fish-speech

Brand new TTS solution

Language:PythonLicense:BSD-3-ClauseStargazers:0Issues:0Issues:0

GPT-SoVITS

1 min voice data can also be used to train a good TTS model! (few shot voice cloning)

Language:PythonLicense:MITStargazers:0Issues:0Issues:0

HierSpeechpp

The official implementation of HierSpeech++

Language:PythonLicense:NOASSERTIONStargazers:0Issues:0Issues:0

lora

Using Low-rank adaptation to quickly fine-tune diffusion models.

Language:Jupyter NotebookLicense:Apache-2.0Stargazers:0Issues:0Issues:0

Matcha-TTS

[ICASSP 2024] 🍵 Matcha-TTS: A fast TTS architecture with conditional flow matching

Language:Jupyter NotebookLicense:MITStargazers:0Issues:0Issues:0

megatts2

Unoffical implementation of Megatts2

Language:PythonLicense:MITStargazers:0Issues:0Issues:0

NS2VC

Unofficial implementation of NaturalSpeech2 for Voice Conversion and Text to Speech

Language:PythonStargazers:0Issues:0Issues:0
Language:PythonStargazers:0Issues:1Issues:0

OpenVoice

Instant voice cloning by MyShell

Language:PythonLicense:NOASSERTIONStargazers:0Issues:0Issues:0

PitchExtractor

Deep Neural Pitch Extractor for Voice Conversion and TTS Training

Language:PythonLicense:MITStargazers:0Issues:0Issues:0

stable-diffusion

A latent text-to-image diffusion model

Language:Jupyter NotebookLicense:NOASSERTIONStargazers:0Issues:0Issues:0

StyleTTS

Official Implementation of StyleTTS

Language:PythonLicense:MITStargazers:0Issues:0Issues:0

StyleTTS2

StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models

Language:PythonLicense:MITStargazers:0Issues:0Issues:0

Tacotron2-PyTorch

Yet another PyTorch implementation of Tacotron 2 with reduction factor and faster training speed.

Language:PythonLicense:MITStargazers:0Issues:0Issues:0

tortoise-tts

A multi-voice TTS system trained with an emphasis on quality

Language:Jupyter NotebookLicense:Apache-2.0Stargazers:0Issues:0Issues:0

Transformer-TTS

A Pytorch Implementation of "Neural Speech Synthesis with Transformer Network"

Language:PythonLicense:MITStargazers:0Issues:0Issues:0
Language:PythonStargazers:0Issues:1Issues:0

versatile_audio_super_resolution

Versatile audio super resolution (any -> 48kHz) with AudioSR.

License:MITStargazers:0Issues:0Issues:0

vits

VITS: Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech

Language:PythonLicense:MITStargazers:0Issues:0Issues:0

XPhoneBERT

XPhoneBERT: A Pre-trained Multilingual Model for Phoneme Representations for Text-to-Speech (INTERSPEECH 2023)

Language:PythonLicense:MITStargazers:0Issues:0Issues:0