lzc's repositories
annotated_deep_learning_paper_implementations
🧑‍🏫 59 Implementations/tutorials of deep learning papers with side-by-side notes 📝; including transformers (original, xl, switch, feedback, vit, ...), optimizers (adam, adabelief, ...), gans (cyclegan, stylegan2, ...), 🎮 reinforcement learning (ppo, dqn), capsnet, distillation, ... 🧠
AudioGPT
AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head
Comprehensive-E2E-TTS
A Non-Autoregressive End-to-End Text-to-Speech (text-to-wav) model, supporting a family of SOTA unsupervised duration modeling methods. This project grows with the research community, aiming to achieve the ultimate E2E-TTS.
DeepLearning-500-questions
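500 Questions on Deep Learning: a question-and-answer treatment of frequently raised topics in probability, linear algebra, machine learning, deep learning, computer vision, and related areas, written to help the author and any readers who need it. The book has 18 chapters and nearly 300,000 characters. Given the author's limited expertise, readers are kindly asked to point out any errors. To be continued. For collaboration, contact scutjy2015@163.com. All rights reserved; infringement will be pursued. Tan, 2018.06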
DiffSinger
DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism (SVS & TTS); AAAI 2022; Forked and maintained by the OpenVPI community
FastSpeech2
with alignment learning and without preprocessing
LeetCode-Python
LeetCode solutions in Python 2.
libri-light
Dataset for lightly supervised training using the LibriVox audiobook recordings. https://librivox.org/
MnTTS2
NCMMSC'2022
s3prl
Self-Supervised Speech/Sound Pre-training and Representation Learning Toolkit
SC_VALL-E
Style-Controllable Zero-Shot Text to Speech Synthesizer based on VALL-E
Text-to-sound-Synthesis
The source code of our paper "Diffsound: discrete diffusion model for text-to-sound generation"
vall-e
An unofficial PyTorch implementation of the audio LM VALL-E, WIP
VendingMachine_In_BASYS2
Project for ES003
Voice-Reproduce
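Chinese real-time voice cloning (VC) and Chinese text-to-speech (TTS). An easy-to-use Chinese voice cloning and speech synthesis system, comprising a speech encoder, a speech synthesizer, a vocoder, and a visualization module.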
voicesmith
[WIP] VoiceSmith makes training text to speech models easy.
WaveRNN
WaveRNN Vocoder + TTS
zhvoice
Chinese voice corpus with clearer, more natural speech, comprising 8 open-source datasets, 3,200 speakers, 900 hours of speech, and 13 million characters.