tsaifangsheng's repositories

AudioLDM

AudioLDM: Generate speech, sound effects, music and beyond, with text.

Language: Python | License: NOASSERTION | Stargazers: 0 | Issues: 0

audiolm-pytorch

Implementation of AudioLM, a SOTA Language Modeling Approach to Audio Generation out of Google Research, in Pytorch

Language: Python | License: MIT | Stargazers: 0 | Issues: 0

AuxiliaryASR

Joint CTC-S2S Phoneme-level ASR for Voice Conversion and TTS (Text-Mel Alignment)

Language: Python | License: MIT | Stargazers: 0 | Issues: 0

bddm

BDDM: Bilateral Denoising Diffusion Models for Fast and High-Quality Speech Synthesis

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0

bark

🔊 Text-Prompted Generative Audio Model

Language: Jupyter Notebook | License: MIT | Stargazers: 0 | Issues: 0

CDiffuSE

Conditional Diffusion Probabilistic Model for Speech Enhancement

License: Apache-2.0 | Stargazers: 0 | Issues: 0

Comprehensive-Transformer-TTS

A non-autoregressive Transformer-based text-to-speech system, supporting a family of SOTA transformers with supervised and unsupervised duration modeling. This project grows with the research community, aiming to achieve the ultimate TTS.

Language: Python | License: MIT | Stargazers: 0 | Issues: 0

ControlNet

Let us control diffusion models!

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0

diffsptk

A differentiable version of SPTK

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0

diffusion_distiller

🚀 PyTorch Implementation of "Progressive Distillation for Fast Sampling of Diffusion Models" (v-diffusion)

Language: Python | License: MIT | Stargazers: 0 | Issues: 0

FastDiff

PyTorch Implementation of FastDiff (IJCAI'22)

Language: Python | Stargazers: 0 | Issues: 0

GeneFace

GeneFace: Generalized and High-Fidelity 3D Talking Face Synthesis; ICLR 2023; Official code

Language: Python | License: MIT | Stargazers: 0 | Issues: 0

google-research

Google Research

License: Apache-2.0 | Stargazers: 0 | Issues: 0

GST-Tacotron

A PyTorch implementation of Style Tokens: Unsupervised Style Modeling, Control and Transfer in End-to-End Speech Synthesis

License: MIT | Stargazers: 0 | Issues: 0

iSTFTNet-pytorch

iSTFTNet: Fast and Lightweight Mel-spectrogram Vocoder Incorporating Inverse Short-time Fourier Transform

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0

MB-iSTFT-VITS

Lightweight and High-Fidelity End-to-End Text-to-Speech with Multi-Band Generation and Inverse Short-Time Fourier Transform

License: Apache-2.0 | Stargazers: 0 | Issues: 0

MSMC-TTS

Official implementation of Multi-Stage Multi-Codebook (MSMC) TTS

Language: Python | License: MIT | Stargazers: 0 | Issues: 0

NeuralSVB

Learning the Beauty in Songs: Neural Singing Voice Beautifier; ACL 2022 (Main conference); Official code

Language: Python | Stargazers: 0 | Issues: 0

nix-tts

🐤 Nix-TTS: An Incredibly Lightweight End-to-End Text-to-Speech Model via Non End-to-End Distillation

License: MIT | Stargazers: 0 | Issues: 0

nnsvs

Neural network-based singing voice synthesis library for research

License: MIT | Stargazers: 0 | Issues: 0

PitchExtractor

Deep Neural Pitch Extractor for Voice Conversion and TTS Training

License: MIT | Stargazers: 0 | Issues: 0

ProDiff

PyTorch Implementation of ProDiff (ACM-MM'22) with an extremely fast diffusion speech synthesis pipeline

Language: Python | License: MIT | Stargazers: 0 | Issues: 0

self-supervised-phone-segmentation

Phoneme segmentation using pre-trained speech models

Language: Python | License: GPL-3.0 | Stargazers: 0 | Issues: 0

stable-diffusion

A latent text-to-image diffusion model

Language: Jupyter Notebook | License: NOASSERTION | Stargazers: 0 | Issues: 0

StarGANv2-VC

StarGANv2-VC: A Diverse, Unsupervised, Non-parallel Framework for Natural-Sounding Voice Conversion

License: MIT | Stargazers: 0 | Issues: 0

state-spaces

Sequence Modeling with Structured State Spaces

License: Apache-2.0 | Stargazers: 0 | Issues: 0

StyleFlow

StyleFlow: Attribute-conditioned Exploration of StyleGAN-generated Images using Conditional Continuous Normalizing Flows (ACM TOG 2021)

Language: Python | Stargazers: 0 | Issues: 0

valle

Zero-Shot Text-To-Speech

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0