Show Lab (showlab)

Organization data from GitHub: https://github.com/showlab

Home Page: https://sites.google.com/view/showlab

GitHub: @showlab

Show Lab's repositories

Awesome-Video-Diffusion

A curated list of recent diffusion models for video generation, editing, and various other applications.

computer_use_ootb

Out-of-the-box (OOTB) GUI Agent for Windows and macOS

Language: Python | License: Apache-2.0 | Stargazers: 1671 | Issues: 20 | Issues: 59

ShowUI

[CVPR 2025] Open-source, End-to-end, Vision-Language-Action model for GUI Agent & Computer Use.

Language: Python | License: Apache-2.0 | Stargazers: 1472 | Issues: 15 | Issues: 67

Show-o

[ICLR 2025] Repository for Show-o, One Single Transformer to Unify Multimodal Understanding and Generation.

Language: Python | License: Apache-2.0 | Stargazers: 1333 | Issues: 17 | Issues: 54

Awesome-GUI-Agent

💻 A curated list of papers and resources for multi-modal Graphical User Interface (GUI) agents.

Awesome-MLLM-Hallucination

📖 A curated list of resources dedicated to hallucination of multimodal large language models (MLLM).

VLog

[CVPR 2025] Video Narration as Vocabulary & Video as Long Document

Language: Python | License: MIT | Stargazers: 579 | Issues: 8 | Issues: 13

Awesome-Unified-Multimodal-Models

📖 This is a repository for organizing papers, codes and other resources related to unified multimodal models.

PhotoDoodle

Code Implementation of "PhotoDoodle: Learning Artistic Image Editing from Few-Shot Pairwise Data"

Language: Python | License: MIT | Stargazers: 373 | Issues: 4 | Issues: 14

MakeAnything

Official code of "MakeAnything: Harnessing Diffusion Transformers for Multi-Domain Procedural Sequence Generation"

Language: Python | License: MIT | Stargazers: 171 | Issues: 4 | Issues: 4

MovieAgent

MovieAgent: Automated Movie Generation via Multi-Agent CoT Planning

Language: Python | Stargazers: 166 | Issues: 0 | Issues: 0

FAR

Code for: "Long-Context Autoregressive Video Modeling with Next-Frame Prediction"

Language: Python | License: MIT | Stargazers: 164 | Issues: 4 | Issues: 2

Awesome-Robotics-Diffusion

(In progress) A curated list of recent robot learning papers incorporating diffusion models for robotics tasks.

ROICtrl

Code for [CVPR 2025] ROICtrl: Boosting Instance Control for Visual Generation

LOVA3

(NeurIPS 2024) Official PyTorch implementation of LOVA3

Language: Python | Stargazers: 90 | Issues: 5 | Issues: 0
Language: Python | Stargazers: 64 | Issues: 0 | Issues: 0

GUI-Thinker

Enable AI to control your PC. This repo includes the WorldGUI Benchmark and GUI-Thinker Agent Framework.

Language: Python | Stargazers: 55 | Issues: 0 | Issues: 0

MovieBench

[CVPR 2025] A Hierarchical Movie Level Dataset for Long Video Generation

Language: Python | Stargazers: 52 | Issues: 4 | Issues: 0

FQGAN

FQGAN: Factorized Visual Tokenization and Generation

Language: Python | License: NOASSERTION | Stargazers: 47 | Issues: 4 | Issues: 1
Language: Python | License: Apache-2.0 | Stargazers: 37 | Issues: 1 | Issues: 1

MovieSeq

[ECCV 2024] Learning Video Context as Interleaved Multimodal Sequences

Language: Jupyter Notebook | Stargazers: 36 | Issues: 3 | Issues: 2

VideoGUI

[NeurIPS 2024 D&B] VideoGUI: A Benchmark for GUI Automation from Instructional Videos

Language: JavaScript | Stargazers: 33 | Issues: 4 | Issues: 0

SMS

Balanced Image Stylization with Style Matching Score

DoraCycle

[CVPR 2025] DoraCycle: Domain-Oriented Adaptation of Unified Generative Model in Multimodal Cycles

Stargazers: 19 | Issues: 0 | Issues: 0

TPDiff

TPDiff: Temporal Pyramid Video Diffusion Model

Stargazers: 19 | Issues: 0 | Issues: 0

DiffSim

Official repository of DiffSim: Taming Diffusion Models for Evaluating Visual Similarity

UniMoD

The code repository of UniMoD

License: Apache-2.0 | Stargazers: 2 | Issues: 0 | Issues: 0
Language: Jupyter Notebook | Stargazers: 2 | Issues: 0 | Issues: 0
Stargazers: 0 | Issues: 0 | Issues: 0