Z-L-D's starred repositories
audiocraft
Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable music generation LM with textual and melodic conditioning.
De-limiter
The official repository for "Music De-limiter Networks via Sample-wise Gain Inversion", which will be presented at WASPAA 2023.
Stable-Diffusion
Stable Diffusion, SDXL, LoRA Training, DreamBooth Training, Automatic1111 Web UI, DeepFake, Deep Fakes, TTS, Animation, Text To Video, Tutorials, Guides, Lectures, Courses, ComfyUI, Google Colab, RunPod, NoteBooks, ControlNet, TTS, Voice Cloning, AI, AI News, ML, ML News, News, Tech, Tech News, Kohya LoRA, Kandinsky 2, DeepFloyd IF, Midjourney
PixArt-alpha
PixArt-α: Fast Training of Diffusion Transformer for Photorealistic Text-to-Image Synthesis
PixArt-sigma
PixArt-Σ: Weak-to-Strong Training of Diffusion Transformer for 4K Text-to-Image Generation
rich-text-to-image
Rich-Text-to-Image Generation
LaVi-Bridge
[ECCV 2024] Bridging Different Language Models and Generative Vision Models for Text-to-Image Generation
DiLightNet
Official Code Release for [SIGGRAPH 2024] DiLightNet: Fine-grained Lighting Control for Diffusion-based Image Generation
StabilityMatrix
Multi-Platform Package Manager for Stable Diffusion
audio-diffusion
Apply diffusion models using the new Hugging Face diffusers package to synthesize music instead of images.
BlockFusion
[TOG 2024] BlockFusion: Expandable 3D Scene Generation using Latent Tri-plane Extrapolation
MotionDreamer
MotionDreamer: Zero-Shot 3D Mesh Animation from Video Diffusion Models
GaussianPrediction
[SIGGRAPH Conference 2024] GaussianPrediction: Dynamic 3D Gaussian Prediction for Motion Extrapolation and Free View Synthesis
ComfyUI-DynamiCrafterWrapper
Wrapper to use DynamiCrafter models in ComfyUI
ComfyUI-FlashFace
ComfyUI Node for FlashFace
threefiner
An interface for text-guided mesh refinement.