chenht2010's repositories
textlesslib
Library for Textless Spoken Language Processing
AlignSTS
Findings of ACL 2023 | AlignSTS: a speech-to-singing (STS) model based on modality disentanglement and cross-modal alignment
awesome-whisper
🔊 Awesome list for Whisper, an open-source AI-powered speech recognition system developed by OpenAI
bytecover
Implementation of the paper "ByteCover: Cover song identification via multi-loss training" (ICASSP 2021)
encodec
State-of-the-art deep learning based audio codec supporting both mono 24 kHz audio and stereo 48 kHz audio.
Music-Source-Separation-Training
Repository for training models for music source separation.
MVSEP-MDX23-music-separation-model
Model for the MDX23 music separation contest
naturalspeech2-pytorch
Implementation of NaturalSpeech 2, a zero-shot speech and singing synthesizer, in PyTorch
soundstorm-pytorch
Implementation of SoundStorm, Efficient Parallel Audio Generation from Google DeepMind, in PyTorch
Speech-Separation-Paper-Tutorial
Must-read papers on neural-network-based speech separation
ultimatevocalremovergui
GUI for a Vocal Remover that uses Deep Neural Networks.
vall-e
PyTorch implementation of VALL-E (zero-shot text-to-speech). Reproduced demo: https://lifeiteng.github.io/valle/index.html
vocalsound
Dataset and baseline code for the VocalSound dataset (ICASSP 2022).
VALL-E-X
An open-source implementation of Microsoft's VALL-E X zero-shot TTS model. A demo is available at https://plachtaa.github.io
voicebox-pytorch
Implementation of Voicebox, a new SOTA text-to-speech network from Meta AI, in PyTorch