sunnnnnnnny's repositories

Language:PythonStargazers:0Issues:0Issues:0

XPhoneBERT

XPhoneBERT: A Pre-trained Multilingual Model for Phoneme Representations for Text-to-Speech (INTERSPEECH 2023)

License:MITStargazers:0Issues:0Issues:0
Stargazers:0Issues:0Issues:0

Tacotron2-PyTorch

Yet another PyTorch implementation of Tacotron 2 with reduction factor and faster training speed.

Language:PythonLicense:MITStargazers:0Issues:0Issues:0
Language:PythonLicense:MITStargazers:0Issues:0Issues:0

VALL-E-X

An open source implementation of Microsoft's VALL-E X zero-shot TTS model. Demo is available in https://plachtaa.github.io

License:MITStargazers:0Issues:0Issues:0

AcademiCodec

AcademiCodec: An Open Source Audio Codec Model for Academic Research

Stargazers:0Issues:0Issues:0

audiocraft

Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable music generation LM with textual and melodic conditioning.

License:MITStargazers:0Issues:0Issues:0
License:Apache-2.0Stargazers:0Issues:0Issues:0

DiffGAN-TTS

PyTorch Implementation of DiffGAN-TTS: High-Fidelity and Efficient Text-to-Speech with Denoising Diffusion GANs

Language:PythonLicense:MITStargazers:0Issues:0Issues:0

ltu

Github Repo for Paper "Listen, Think, and Understand".

Stargazers:0Issues:0Issues:0

ParallelWaveGAN

Unofficial Parallel WaveGAN (+ MelGAN & Multi-band MelGAN & HiFi-GAN & StyleMelGAN) with Pytorch

License:MITStargazers:0Issues:0Issues:0

lora-svc

singing voice change based on whisper, and lora for singing voice clone

License:MITStargazers:0Issues:0Issues:0

CLAP

Learning audio concepts from natural language supervision

License:MITStargazers:0Issues:0Issues:0

bark

🔊 Text-Prompted Generative Audio Model

License:MITStargazers:0Issues:0Issues:0

melgan-neurips

GAN-based Mel-Spectrogram Inversion Network for Text-to-Speech Synthesis

License:MITStargazers:0Issues:0Issues:0

DiffSinger

DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism (SVS & TTS); AAAI 2022; Official code

License:MITStargazers:0Issues:0Issues:0

wetts

Production First and Production Ready End-to-End Text-to-Speech Toolkit

License:Apache-2.0Stargazers:0Issues:0Issues:0

unilm

Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities

License:MITStargazers:0Issues:0Issues:0

g2pW_chinese

Chinese Mandarin Grapheme-to-Phoneme Converter. 中文轉注音或拼音 (INTERSPEECH 2022)

License:Apache-2.0Stargazers:0Issues:0Issues:0

vall-e

PyTorch implementation of VALL-E(Zero-Shot Text-To-Speech), Reproduced Demo https://lifeiteng.github.io/valle/index.html

License:Apache-2.0Stargazers:0Issues:0Issues:0

FastSpeech2

An implementation of Microsoft's "FastSpeech 2: Fast and High-Quality End-to-End Text to Speech"

License:MITStargazers:0Issues:0Issues:0

StarGAN-Voice-Conversion-2

A pytorch implementation of StarGAN-VC2

Stargazers:0Issues:0Issues:0

pytorch-StarGAN-VC

Fully reproduce the paper of StarGAN-VC. Stable training and Better audio quality .

Stargazers:0Issues:0Issues:0

polyphone-g2pL

The implementation of g2pL with a new open dataset.

License:Apache-2.0Stargazers:0Issues:0Issues:0

INTERSPEECH-2023-Papers

INTERSPEECH 2023 Papers: A complete collection of influential and exciting research papers from the INTERSPEECH 2023 conference. Explore the latest advances in speech and language processing. Code included. Star the repository to support the advancement of speech technology!

License:MITStargazers:0Issues:0Issues:0

TTS-TextAnalyzer

TTS Text Analyzer

License:Apache-2.0Stargazers:0Issues:0Issues:0
Stargazers:0Issues:0Issues:0

SpeechT5

Unified-Modal Speech-Text Pre-Training for Spoken Language Processing

License:MITStargazers:0Issues:0Issues:0

FastSpeech_sing

The Implementation of FastSpeech based on pytorch.

License:MITStargazers:0Issues:0Issues:0