Z-L-D

Z-L-D's starred repositories

taggui

Tag manager and captioner for image datasets

Language: Python, License: GPL-3.0, 61900

airgen

Official source code of airsep

Language: Python, License: MIT, 3300

audiocraft

Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable music generation LM with textual and melodic conditioning.

Language: Python, License: MIT, 2052000

De-limiter

An official repository of "Music De-limiter Networks via Sample-wise Gain Inversion", which will be presented in WASPAA 2023.

Language: Python, License: MIT, 6500

whole-song-gen

Language: Python, License: MIT, 2300

Stable-Diffusion

Stable Diffusion, SDXL, LoRA Training, DreamBooth Training, Automatic1111 Web UI, DeepFake, Deep Fakes, TTS, Animation, Text To Video, Tutorials, Guides, Lectures, Courses, ComfyUI, Google Colab, RunPod, NoteBooks, ControlNet, TTS, Voice Cloning, AI, AI News, ML, ML News, News, Tech, Tech News, Kohya LoRA, Kandinsky 2, DeepFloyd IF, Midjourney

Language: Jupyter Notebook, License: GPL-3.0, 198500

PixArt-alpha

PixArt-α: Fast Training of Diffusion Transformer for Photorealistic Text-to-Image Synthesis

Language: Python, License: AGPL-3.0, 265100

PixArt-sigma

PixArt-Σ: Weak-to-Strong Training of Diffusion Transformer for 4K Text-to-Image Generation

Language: Python, License: AGPL-3.0, 156800

3DGPT

rich-text-to-image

Rich-Text-to-Image Generation

Language: Python, License: MIT, 75000

sd-webui-rich-text

Language: Python, 11800

ELLA

ELLA: Equip Diffusion Models with LLM for Enhanced Semantic Alignment

Language: Python, License: Apache-2.0, 103600

LaVi-Bridge

[ECCV 2024] Bridging Different Language Models and Generative Vision Models for Text-to-Image Generation

Language: Python, License: MIT, 30000

DiLightNet

Official Code Release for [SIGGRAPH 2024] DiLightNet: Fine-grained Lighting Control for Diffusion-based Image Generation

Language: Python, License: MIT, 8400

StabilityMatrix

Multi-Platform Package Manager for Stable Diffusion

Language: C#, License: AGPL-3.0, 421400

RaDe-GS

RaDe-GS: Rasterizing Depth in Gaussian Splatting

Language: C++, License: NOASSERTION, 43300

audio-diffusion

Apply diffusion models using the new Hugging Face diffusers package to synthesize music instead of images.

Language: Jupyter Notebook, License: GPL-3.0, 69400

part123

https://liuar0512.github.io/part123_official_page/

Language: Python, License: MIT, 3400

stmc

Implementation of "Multi-Track Timeline Control for Text-Driven 3D Human Motion Generation" from CVPR Workshop on Human Motion Generation 2024.

Language: Python, License: NOASSERTION, 7600

BlockFusion

[TOG 2024] BlockFusion: Expandable 3D Scene Generation using Latent Tri-plane Extrapolation

1600

Unique3D

Official implementation of Unique3D: High-Quality and Efficient 3D Mesh Generation from a Single Image

Language: Python, License: MIT, 278600

MotionDreamer

MotionDreamer: Zero-Shot 3D Mesh Animation from Video Diffusion Models

1700

GaussianPrediction

[SIGGRAPH Conference 2024] GaussianPrediction: Dynamic 3D Gaussian Prediction for Motion Extrapolation and Free View Synthesis

4500

ComfyUI-DynamiCrafterWrapper

Wrapper to use DynamiCrafter models in ComfyUI

Language: Python, License: NOASSERTION, 57900

Omost

Your image is almost there!

Language: Python, License: Apache-2.0, 714000

MusePose

MusePose: a Pose-Driven Image-to-Video Framework for Virtual Human Generation

Language: Python, License: NOASSERTION, 206200

ComfyUI-FlashFace

ComfyUI Node for FlashFace

Language: Python, License: MIT, 6100

FlashFace

Language: Python, License: MIT, 34300

threefiner

An interface for text-guided mesh refinement.

Language: Python, License: Apache-2.0, 16900

Analogist

Analogist: Out-of-the-box Visual In-Context Learning with Image Diffusion Model (SIGGRAPH 2024)

Language: Python, License: MIT, 2900