sunxh16's repositories
AcademiCodec
AcademiCodec: An Open Source Audio Codec Model for Academic Research
Amphion
Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.
async_cosyvoice
Accelerates CosyVoice2 inference using vLLM.
book-text-to-speech
A book about Text-to-Speech (TTS) in Chinese.
Concatenate_wav
Concatenate WAV files (for unit selection)
CosyVoice
Multilingual large voice generation model with full-stack support for inference, training, and deployment.
F5-TTS
Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching"
FastSpeech2
An implementation of Microsoft's "FastSpeech 2: Fast and High-Quality End-to-End Text to Speech"
FloWaveNet
A PyTorch implementation of "FloWaveNet: A Generative Flow for Raw Audio"
GPT-SoVITS
Just 1 minute of voice data is enough to train a good TTS model (few-shot voice cloning).
NeuralVoicePuppetry
This GitHub repository contains the network architectures of NeuralVoicePuppetry.
nonparaSeq2seqVC_code
Implementation code of non-parallel sequence-to-sequence VC
onnxruntime
ONNX Runtime
ParallelWaveGAN
Unofficial Parallel WaveGAN (+ MelGAN & Multi-band MelGAN) with PyTorch
Python-Wrapper-for-World-Vocoder
A Python wrapper for the high-quality vocoder "World"
rigl
End-to-end training of sparse deep neural networks with little-to-no performance loss.
seed-vc
Zero-shot voice conversion and singing voice conversion, with real-time support
SincNet
SincNet is a neural architecture for efficiently processing raw audio samples.
so-vits-svc
SoftVC VITS Singing Voice Conversion
sp2si-code
Contains code for our work on speech to singing conversion (ICASSP 2020)
tacotron2_v1
DeepMind's Tacotron-2 TensorFlow implementation
TTS
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
vits
VITS: Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech
vocos
Vocos: Closing the gap between time-domain and Fourier-based neural vocoders for high-quality audio synthesis
wav2letter
Facebook AI Research Automatic Speech Recognition Toolkit