Jou-ching (George) Sung's starred repositories
voice-changer
Realtime Voice Changer
LocalAI
The free, open-source OpenAI alternative. Self-hosted, community-driven, and local-first. Drop-in replacement for OpenAI that runs on consumer-grade hardware. No GPU required. Runs gguf, transformers, diffusers, and many other model architectures. Generates text, audio, video, and images, and also offers voice cloning.
openWakeWord
An open-source audio wake word (or phrase) detection framework with a focus on performance and simplicity.
llm-datasets
High-quality datasets, tools, and concepts for LLM fine-tuning.
streamlit-audio-recorder
Record audio from the user's microphone in apps deployed to the web (via the browser Media API; React-based Streamlit custom component).
tts-generation-webui
TTS Generation Web UI (Bark, MusicGen + AudioGen, Tortoise, RVC, Vocos, Demucs, SeamlessM4T, MAGNet, StyleTTS2, MMS)
streamlit-webrtc
Real-time video and audio streams over the network, with Streamlit.
distil-whisper
Distilled variant of Whisper for speech recognition. 6x faster, 50% smaller, within 1% word error rate.
diffusion-fast
Faster generation with text-to-image diffusion models.
parler-tts
Inference and training library for high-quality TTS models.
InstantStyle
InstantStyle: Free Lunch towards Style-Preserving in Text-to-Image Generation 🔥
tryondiffusion
PyTorch implementation of "TryOnDiffusion: A Tale of Two UNets", a virtual try-on diffusion-based network by Google
sd-forge-layerdiffuse
[WIP] Layer Diffusion for WebUI (via Forge)