Show Lab's repositories
Awesome-Video-Diffusion
A curated list of recent diffusion models for video generation, editing, and various other applications.
computer_use_ootb
Out-of-the-box (OOTB) GUI Agent for Windows and macOS
Awesome-GUI-Agent
💻 A curated list of papers and resources for multi-modal Graphical User Interface (GUI) agents.
Awesome-MLLM-Hallucination
📖 A curated list of resources dedicated to hallucination in multimodal large language models (MLLMs).
Awesome-Unified-Multimodal-Models
📖 A repository for organizing papers, code, and other resources related to unified multimodal models.
Awesome-Robotics-Diffusion
A curated list of recent robot learning papers incorporating diffusion models for robotics tasks.
MakeAnything
Official code of "MakeAnything: Harnessing Diffusion Transformers for Multi-Domain Procedural Sequence Generation"
sparseformer
(ICLR 2024, CVPR 2024) SparseFormer
MovieBench
[CVPR 2025] A Hierarchical Movie Level Dataset for Long Video Generation
EvolveDirector
[NeurIPS 2024] EvolveDirector: Approaching Advanced Text-to-Image Generation with Large Vision-Language Models
LayerTracer
Official code of "LayerTracer: Cognitive-Aligned Layered SVG Synthesis via Diffusion Transformer"
IDProtector
Code implementation of "IDProtector: An Adversarial Noise Encoder to Protect Against ID-Preserving Image Generation"
VisInContext
Official implementation of "Leveraging Visual Tokens for Extended Text Contexts in Multi-Modal Learning"
Tune-An-Ellipse
[CVPR 2024] Tune-An-Ellipse: CLIP Has Potential to Find What You Want