Xiangtai Li's repositories
Awesome-Segmentation-With-Transformer
[Arxiv-04-2023] Transformer-Based Visual Segmentation: A Survey
DecoupleSegNets
[ECCV-2020] Improving Semantic Segmentation via Decoupled Body and Edge Supervision
Video-K-Net
[CVPR-2022 (oral)] Video K-Net: A Simple, Strong, and Unified Baseline for Video Segmentation
Panoptic-PartFormer
[ECCV-2022] The First Unified End-to-End System for Panoptic Part Segmentation
TemporalPyramidRouting
[T-PAMI-2022] Temporal Pyramid Routing for Video Instance Segmentation
QueryPanSeg
[ICIP-2022] Query Learning of Both Thing and Stuff for Panoptic Segmentation
awesome-3D-gaussian-splatting
Curated list of papers and resources focused on 3D Gaussian Splatting, intended to keep pace with the anticipated surge of research in the coming months.
lxtGH.github.io
AcadHomepage: A Modern and Responsive Academic Personal Homepage
DiT
Official PyTorch Implementation of "Scalable Diffusion Models with Transformers"
Generalizable-Mixture-of-Experts
GMoE could be the next backbone model for many kinds of generalization tasks.
InternGPT
InternGPT / InternChat allows you to interact with ChatGPT by clicking, dragging and drawing using a pointing device.
latent-diffusion
High-Resolution Image Synthesis with Latent Diffusion Models
LLaVA
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
ov-seg
This is the official PyTorch implementation of the paper "Open-Vocabulary Semantic Segmentation with Mask-adapted CLIP".
PixArt-alpha
Fast Training of Diffusion Transformer for Photorealistic Text-to-Image Synthesis
PointNeXt
[NeurIPS'22] PointNeXt: Revisiting PointNet++ with Improved Training and Scaling Strategies
RefT
Code for "Reference Twice: A Simple and Unified Baseline for Few-Shot Instance Segmentation"
segment-anything
The repository provides code for running inference with the Segment Anything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
TransVOD
This repository contains the code for the paper "End-to-End Video Object Detection with Spatial-Temporal Transformers"