Shilong Zhang (jshilong)

Company: The University of Hong Kong (HKU)

Location: Hong Kong

Home Page: https://jshilong.github.io


Organizations
open-mmlab

Shilong Zhang's starred repositories

Real-ESRGAN

Real-ESRGAN aims at developing Practical Algorithms for General Image/Video Restoration.

Language: Python | License: BSD-3-Clause | Stargazers: 27167 | Issues: 232 | Issues: 656

magic-animate

[CVPR 2024] MagicAnimate: Temporally Consistent Human Image Animation using Diffusion Model

Language: Python | License: BSD-3-Clause | Stargazers: 10263 | Issues: 102 | Issues: 143

CogVLM

A state-of-the-art open visual language model | Multimodal pre-trained model

Language: Python | License: Apache-2.0 | Stargazers: 5709 | Issues: 66 | Issues: 408

StoryDiffusion

Create Magic Story!

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 5579 | Issues: 85 | Issues: 130

VideoCrafter

VideoCrafter2: Overcoming Data Limitations for High-Quality Video Diffusion Models

Language: Python | License: NOASSERTION | Stargazers: 4375 | Issues: 70 | Issues: 74

AnyDoor

Official implementations for paper: Anydoor: zero-shot object-level image customization

Language: Python | License: MIT | Stargazers: 3849 | Issues: 86 | Issues: 94

StableSR

[IJCV2024] Exploiting Diffusion Prior for Real-World Image Super-Resolution

Language: Python | License: NOASSERTION | Stargazers: 2008 | Issues: 23 | Issues: 136

MusePose

MusePose: a Pose-Driven Image-to-Video Framework for Virtual Human Generation

Language: Python | License: NOASSERTION | Stargazers: 1970 | Issues: 41 | Issues: 55

Lumina-T2X

Lumina-T2X is a unified framework for Text to Any Modality Generation

Language: Python | License: MIT | Stargazers: 1934 | Issues: 29 | Issues: 78

awesome-openai-vision-api-experiments

Must-have resource for anyone who wants to experiment with and build on the OpenAI vision API 🔥

Emu

Emu Series: Generative Multimodal Models from BAAI

Language: Python | License: Apache-2.0 | Stargazers: 1576 | Issues: 21 | Issues: 85

Awesome-Video-Diffusion-Models

[Arxiv] A Survey on Video Diffusion Models

LlamaGen

Autoregressive Model Beats Diffusion: 🦙 Llama for Scalable Image Generation

Language: Python | License: MIT | Stargazers: 1083 | Issues: 19 | Issues: 41

MimicBrush

Official implementations for paper: Zero-shot Image Editing with Reference Imitation

Language: Python | License: Apache-2.0 | Stargazers: 957 | Issues: 13 | Issues: 16

Upscale-A-Video

Upscale-A-Video: Temporal-Consistent Diffusion Model for Real-World Video Super-Resolution

Awesome-Controllable-T2I-Diffusion-Models

A collection of resources on controllable generation with text-to-image diffusion models.

rcg

PyTorch implementation of RCG https://arxiv.org/abs/2312.03701

Language: Python | License: MIT | Stargazers: 774 | Issues: 7 | Issues: 33

Groma

[ECCV2024] Grounded Multimodal Large Language Model with Localized Visual Tokenization

Language: Python | License: Apache-2.0 | Stargazers: 498 | Issues: 35 | Issues: 20

Pandora

Pandora: Towards General World Model with Natural Language Actions and Video States

streamv2v

Official PyTorch implementation of StreamV2V.

Language: Python | License: NOASSERTION | Stargazers: 406 | Issues: 8 | Issues: 6

PoseAnything

A Graph-Based Approach for Category-Agnostic Pose Estimation [ECCV 2024]

Language: Python | License: Apache-2.0 | Stargazers: 285 | Issues: 4 | Issues: 10

TalkSHOW

This is the official repository for TalkSHOW: Generating Holistic 3D Human Motion from Speech [CVPR2023].

nosmpl

Accelerated SMPL operations, commonly used to generate 3D human meshes; STAR included.

Language: Python | License: GPL-3.0 | Stargazers: 125 | Issues: 3 | Issues: 15

Aurora

Official implementation of Aurora

Language: Python | License: NOASSERTION | Stargazers: 80 | Issues: 7 | Issues: 1

DAC-DETR

[NIPS2023] This is an official implementation of the paper "DAC-DETR: Divide the Attention Layers and Conquer".

Language: Python | License: MIT | Stargazers: 52 | Issues: 1 | Issues: 5

ComfyUI-FlashFace

ComfyUI Node for FlashFace

Language: Python | License: MIT | Stargazers: 41 | Issues: 1 | Issues: 17

Language: Python | License: Apache-2.0 | Stargazers: 15 | Issues: 2 | Issues: 2