haoxiaoyang444's repositories
llm-course
Course to get into Large Language Models (LLMs) with roadmaps and Colab notebooks.
tortoise-tts
A multi-voice TTS system trained with an emphasis on quality
TTS
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
UniAudio
The Open Source Code of UniAudio
so-vits-svc-5.0
Core Engine of Singing Voice Conversion & Singing Voice Clone
AudioLDM
AudioLDM: Generate speech, sound effects, music and beyond, with text.
NATSpeech
A Non-Autoregressive Text-to-Speech (NAR-TTS) framework, including official PyTorch implementation of PortaSpeech (NeurIPS 2021) and DiffSpeech (AAAI 2022)
encodec
State-of-the-art deep-learning-based audio codec supporting both mono 24 kHz audio and stereo 48 kHz audio.
repgan
RepVGG + HiFi-GAN
SpecVQGAN
Source code for "Taming Visually Guided Sound Generation" (Oral at BMVC 2021)
pytorch_wavelets
PyTorch implementation of 2D Discrete Wavelet (DWT) and Dual-Tree Complex Wavelet Transforms (DTCWT), and a DTCWT-based ScatterNet
hifi-gan
HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis
multiband_melgan
An unofficial implementation of https://arxiv.org/abs/2005.05106