zwf's starred repositories
LLaMA-Omni
LLaMA-Omni is a low-latency and high-quality end-to-end speech interaction model built upon Llama-3.1-8B-Instruct, aiming to achieve speech capabilities at the GPT-4o level.
cog-flux-dev-inpainting
🎨 Fill in masked parts of images with FLUX.1-dev 🖌️
MakerSkillTree
A repository of Maker Skill Trees and templates to make your own.
email-reply-parser
✉️ Email reply parser library for Python
matmulfreellm
Implementation for MatMul-free LM.
LLM-Merging
LLM-Merging: Building LLMs Efficiently through Merging
schedule_free
Schedule-Free Optimization in PyTorch
GigaSpeech
Large, modern dataset for speech recognition
VoiceCraft
Zero-Shot Speech Editing and Text-to-Speech in the Wild
GPT-SoVITS
As little as 1 minute of voice data can be used to train a good TTS model! (few-shot voice cloning)