HAESUNG JEON (chad.plus)'s starred repositories
video-retalking
[SIGGRAPH Asia 2022] VideoReTalking: Audio-based Lip Synchronization for Talking Head Video Editing In the Wild
pyAudioAnalysis
Python Audio Analysis Library: Feature Extraction, Classification, Segmentation and Applications
Qwen-Audio
The official repo of Qwen-Audio (通义千问-Audio) chat & pretrained large audio language model proposed by Alibaba Cloud.
HierSpeechpp
The official implementation of HierSpeech++
Real-Time-Latent-Consistency-Model
App showcasing multiple real-time diffusion model pipelines with Diffusers
speech-denoising-wavenet
A neural network for end-to-end speech denoising
normalizing-flows
PyTorch implementation of normalizing flow models
ZeroSpeech
VQ-VAE for Acoustic Unit Discovery and Voice Conversion
ai-audio-datasets-list
This is a list of datasets consisting of speech, music, and sound effects, which can provide training data for Generative AI, AIGC, AI model training, intelligent audio tool development, and audio applications. It is mainly used for speech recognition, speech synthesis, singing voice synthesis, music information retrieval, music generation, etc.
pflowtts_pytorch
Unofficial implementation of NVIDIA P-Flow TTS paper
awesome-voice-conversion
A curated list of awesome voice conversion projects and communities.
HPMDubbing
[CVPR 2023] Official code for paper: Learning to Dub Movies via Hierarchical Prosody Models.
character-factory
Generate characters for SillyTavern, TavernAI, and TextGenerationWebUI using an LLM and Stable Diffusion