Xiangtai Li's starred repositories
Awesome-Video-Diffusion
A curated list of recent diffusion models for video generation, editing, restoration, understanding, etc.
kmax-deeplab
A PyTorch re-implementation, based on Detectron2, of the ECCV 2022 paper "k-means Mask Transformer".
Prompt-Diffusion
Official PyTorch implementation of the paper "In-Context Learning Unlocked for Diffusion Models"
StreamPETR
[ICCV 2023] StreamPETR: Exploring Object-Centric Temporal Modeling for Efficient Multi-View 3D Object Detection
Point-In-Context
[NeurIPS 2023] Implementation of the paper: Explore In-Context Learning for 3D Point Cloud Understanding
OmniObject3D
[CVPR 2023 Award Candidate] OmniObject3D: Large-Vocabulary 3D Object Dataset for Realistic Perception, Reconstruction and Generation
ContextDET
Contextual Object Detection with Multimodal Large Language Models
InternVideo
Video Foundation Models & Data for Multimodal Understanding
InternGPT
InternGPT (iGPT) is an open-source demo platform where you can easily showcase your AI models. It now supports DragGAN, ChatGPT, ImageBind, multimodal chat like GPT-4, SAM, interactive image editing, etc. Try it at igpt.opengvlab.com (an online demo system supporting DragGAN, ChatGPT, ImageBind, and SAM)
LLaMA-Adapter
[ICLR 2024] Fine-tuning LLaMA to follow Instructions within 1 Hour and 1.2M Parameters
learning_research
My personal research experience
open_flamingo
An open-source framework for training large multimodal models.
SegmentAnyRGBD
Segment Any RGBD
Multimodal-GPT
Multimodal-GPT