Dalu Feng's starred repositories
whisper.cpp
Port of OpenAI's Whisper model in C/C++
so-vits-svc
SoftVC VITS Singing Voice Conversion
gaussian-splatting
Original reference implementation of "3D Gaussian Splatting for Real-Time Radiance Field Rendering"
dreamgaussian
[ICLR 2024 Oral] Generative Gaussian Splatting for Efficient 3D Content Creation
pytorch-fid
Compute FID scores with PyTorch.
Lumina-T2X
Lumina-T2X is a unified framework for Text to Any Modality Generation
rhubarb-lip-sync
Rhubarb Lip Sync is a command-line tool that automatically creates 2D mouth animation from voice recordings. You can use it for characters in computer games, in animated cartoons, or in any other project that requires animating mouths based on existing recordings.
unidiffuser
Code and models for the paper "One Transformer Fits All Distributions in Multi-Modal Diffusion"
2048-python
🐍 2048
CharsiuG2P
Multilingual G2P in 100 languages
PromptingWhisper
Prompting Whisper for Audio-Visual Speech Recognition, Code-Switched Speech Recognition, and Zero-Shot Speech Translation
LoadLoraWithTags
Save/load trigger words for LoRAs from a JSON file and auto-fetch them from Civitai if they are missing. Optional prompt input to auto-append them (toggleable). Uses actual alphabetical order and prints trigger words to the terminal. Also has a bypass toggle to disable without setting the sliders to 0.
Visual-Audio-Memory
PyTorch implementation of "Multi-modality Associative Bridging through Memory: Speech Sound Recollected from Face Video" (ICCV2021)