Z-L-D

Z-L-D's starred repositories

taggui

Tag manager and captioner for image datasets

Language: Python | License: GPL-3.0 | 52300

airgen

Official source code of airsep

Language: Python | License: MIT | 2900

audiocraft

Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable music generation LM with textual and melodic conditioning.

Language: Python | License: MIT | 2028400

De-limiter

The official repository of "Music De-limiter Networks via Sample-wise Gain Inversion", to be presented at WASPAA 2023.

Language: Python | License: MIT | 6400

whole-song-gen

Language: Python | License: MIT | 1900

Stable-Diffusion

Stable Diffusion, SDXL, LoRA Training, DreamBooth Training, Automatic1111 Web UI, DeepFake, Deep Fakes, TTS, Animation, Text To Video, Tutorials, Guides, Lectures, Courses, ComfyUI, Google Colab, RunPod, NoteBooks, ControlNet, TTS, Voice Cloning, AI, AI News, ML, ML News, News, Tech, Tech News, Kohya LoRA, Kandinsky 2, DeepFloyd IF, Midjourney

Language: Jupyter Notebook | License: GPL-3.0 | 190700

PixArt-alpha

PixArt-α: Fast Training of Diffusion Transformer for Photorealistic Text-to-Image Synthesis

Language: Python | License: AGPL-3.0 | 257500

PixArt-sigma

PixArt-Σ: Weak-to-Strong Training of Diffusion Transformer for 4K Text-to-Image Generation

Language: Python | License: AGPL-3.0 | 149600

3DGPT

rich-text-to-image

Rich-Text-to-Image Generation

Language: Python | License: MIT | 74800

sd-webui-rich-text

Language: Python | 11700

ELLA

ELLA: Equip Diffusion Models with LLM for Enhanced Semantic Alignment

Language: Python | License: Apache-2.0 | 100500

LaVi-Bridge

[ECCV 2024] Bridging Different Language Models and Generative Vision Models for Text-to-Image Generation

Language: Python | License: MIT | 29400

DiLightNet

Official code release for [SIGGRAPH 2024] DiLightNet: Fine-grained Lighting Control for Diffusion-based Image Generation

Language: Python | License: MIT | 5700

StabilityMatrix

Multi-Platform Package Manager for Stable Diffusion

Language: C# | License: AGPL-3.0 | 345900

RaDe-GS

RaDe-GS: Rasterizing Depth in Gaussian Splatting

Language: C++ | License: NOASSERTION | 38800

audio-diffusion

Apply diffusion models using the new Hugging Face diffusers package to synthesize music instead of images.

Language: Jupyter Notebook | License: GPL-3.0 | 68600

part123

https://liuar0512.github.io/part123_official_page/

License: MIT | 3000

stmc

Implementation of "Multi-Track Timeline Control for Text-Driven 3D Human Motion Generation" from CVPR Workshop on Human Motion Generation 2024.

Language: Python | License: NOASSERTION | 6700

BlockFusion

[TOG 2024] BlockFusion: Expandable 3D Scene Generation using Latent Tri-plane Extrapolation

1600

Unique3D

Official implementation of Unique3D: High-Quality and Efficient 3D Mesh Generation from a Single Image

Language: Python | License: MIT | 254400

MotionDreamer

MotionDreamer: Zero-Shot 3D Mesh Animation from Video Diffusion Models

1500

GaussianPrediction

[SIGGRAPH Conference 2024] GaussianPrediction: Dynamic 3D Gaussian Prediction for Motion Extrapolation and Free View Synthesis

3600

ComfyUI-DynamiCrafterWrapper

Wrapper to use DynamiCrafter models in ComfyUI

Language: Python | License: NOASSERTION | 53800

Omost

Your image is almost there!

Language: Python | License: Apache-2.0 | 694700

MusePose

MusePose: a Pose-Driven Image-to-Video Framework for Virtual Human Generation

Language: Python | License: NOASSERTION | 195800

ComfyUI-FlashFace

ComfyUI Node for FlashFace

Language: Python | License: MIT | 4100

FlashFace

Language: Python | License: MIT | 29200

threefiner

An interface for text-guided mesh refinement.

Language: Python | License: Apache-2.0 | 16100

Analogist

Analogist: Out-of-the-box Visual In-Context Learning with Image Diffusion Model (SIGGRAPH 2024)

Language: Python | License: MIT | 2800