sunnnnnnnny's repositories

open-musiclm

Implementation of MusicLM, a text to music model published by Google Research, with a few modifications.

Language: Python, License: MIT, Stargazers: 1, Issues: 1, Issues: 0

AdaSpeech

An implementation of Microsoft's "AdaSpeech: Adaptive Text to Speech for Custom Voice"

Language: Python, Stargazers: 0, Issues: 0, Issues: 0

Awesome-Singing-Voice-Synthesis-and-Singing-Voice-Conversion

A list of papers and projects on cutting-edge Speech Synthesis, Text-to-Speech (TTS), Singing Voice Synthesis (SVS), Voice Conversion (VC), Singing Voice Conversion (SVC), and related work of interest (such as Music Synthesis, Automatic Music Transcription, Automatic MOS Prediction, SSL-based ASR, etc.).

Stargazers: 0, Issues: 0, Issues: 0

CDFSE_FastSpeech2

The Official Implementation of “Content-Dependent Fine-Grained Speaker Embedding for Zero-Shot Speaker Adaptation in Text-to-Speech Synthesis”

Language: Python, License: MIT, Stargazers: 0, Issues: 0, Issues: 0

Chinese-FastSpeech2

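Continues training on the Biaobei (标贝) data and improves on the original FastSpeech2 model by introducing prosody representation and prosody prediction modules, making the Chinese pronunciation more vivid and rhythmic.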

Language: Python, Stargazers: 0, Issues: 0, Issues: 0

text

my first

Stargazers: 0, Issues: 0, Issues: 0

chinese_speech_pretrain

Chinese speech pretrained models

Stargazers: 0, Issues: 0, Issues: 0

DALL-E

PyTorch package for the discrete VAE used for DALL·E.

License: NOASSERTION, Stargazers: 0, Issues: 0, Issues: 0

denoiser

Real Time Speech Enhancement in the Waveform Domain (Interspeech 2020). We provide a PyTorch implementation of the paper Real Time Speech Enhancement in the Waveform Domain, in which we present a causal speech enhancement model working on the raw waveform that runs in real-time on a laptop CPU. The proposed model is based on an encoder-decoder archi

License: NOASSERTION, Stargazers: 0, Issues: 0, Issues: 0

espnet_onnx

ONNX wrapper for ESPnet inference models

License: MIT, Stargazers: 0, Issues: 0, Issues: 0

g2p

g2p: English Grapheme To Phoneme Conversion

License: Apache-2.0, Stargazers: 0, Issues: 0, Issues: 0

jets

JETS: Jointly Training FastSpeech2 and HiFi-GAN for End to End Text to Speech

Language: Python, License: Apache-2.0, Stargazers: 0, Issues: 0, Issues: 0

paper-reading

Paragraph-by-paragraph close reading of classic and new deep learning papers

License: Apache-2.0, Stargazers: 0, Issues: 0, Issues: 0

polyphone

Chinese polyphone disambiguation for Text-to-Speech applications

Stargazers: 0, Issues: 0, Issues: 0

riffusion

Stable diffusion for real-time music generation

License: MIT, Stargazers: 0, Issues: 0, Issues: 0

tacotron2

Forked from NVIDIA/tacotron2 and merged with Rayhane-mamah/Tacotron-2

License: BSD-3-Clause, Stargazers: 0, Issues: 0, Issues: 0

tacotron2-emo

Multispeaker & Emotional TTS based on Tacotron 2 and Waveglow

License: BSD-3-Clause, Stargazers: 0, Issues: 0, Issues: 0

tacotron2-nvidia

Tacotron 2 - PyTorch implementation with faster-than-realtime inference

License: BSD-3-Clause, Stargazers: 0, Issues: 0, Issues: 0

VISinger2

VISinger 2: High-Fidelity End-to-End Singing Voice Synthesis Enhanced by Digital Signal Processing Synthesizer

Language: Python, Stargazers: 0, Issues: 0, Issues: 0

vocos

Vocos: Closing the gap between time-domain and Fourier-based neural vocoders for high-quality audio synthesis

License: MIT, Stargazers: 0, Issues: 0, Issues: 0

w2v2-how-to

How to use our public wav2vec2 dimensional emotion model

License: MIT, Stargazers: 0, Issues: 0, Issues: 0