yztongzhan

Zhan Tong's starred repositories

segment-anything

The repository provides code for running inference with the Segment Anything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.

Language: Jupyter Notebook | License: Apache-2.0 | 44604 294 640

DragGAN

Official Code for DragGAN (SIGGRAPH 2023)

Language: Python | License: NOASSERTION | 35246 1002 183

codellama

Inference code for CodeLlama models

Language: Python | License: NOASSERTION | 15251 173 187

open-gpu-kernel-modules

NVIDIA Linux open GPU kernel module source

Language: C | License: NOASSERTION | 14049 171 306

CoDeF

[CVPR 2024 Highlight] Official PyTorch implementation of CoDeF: Content Deformation Fields for Temporally Consistent Video Processing

Language: Python | License: NOASSERTION | 4773 73 78

InternGPT (iGPT) is an open-source demo platform where you can easily showcase your AI models. It now supports DragGAN, ChatGPT, ImageBind, multimodal chat like GPT-4, SAM, interactive image editing, etc. Try it at igpt.opengvlab.com (an online demo system supporting DragGAN, ChatGPT, ImageBind, and SAM)

Language: Python | License: Apache-2.0 | 3145 43 49

TigerBot

TigerBot: A multi-language multi-task LLM

Language: Python | License: Apache-2.0 | 2215 31 124

DiffusionDet

[ICCV2023 Oral] PyTorch implementation of DiffusionDet (https://arxiv.org/abs/2211.09788)

Language: Python | License: NOASSERTION | 2001 17 110

MAT

MAT: Mask-Aware Transformer for Large Hole Image Inpainting

Language: Python | License: NOASSERTION | 697 10 112

VideoMAEv2

[CVPR 2023] VideoMAE V2: Scaling Video Masked Autoencoders with Dual Masking

Language: Python | License: MIT | 423 6 46

hmr-survey

[TPAMI 2023] Recovering 3D Human Mesh from Monocular Images: A Survey

331 15 5

AdaptFormer

[NeurIPS 2022] Implementation of "AdaptFormer: Adapting Vision Transformers for Scalable Visual Recognition"

Language: Python | License: MIT | 294 7 35

SparseBEV

[ICCV 2023] SparseBEV: High-Performance Sparse 3D Object Detection from Multi-Camera Videos

Language: Python | License: MIT | 283 9 69

Occupancy-MAE

Occupancy-MAE: Self-supervised Pre-training Large-scale LiDAR Point Clouds with Masked Occupancy Autoencoders

Language: Python | License: Apache-2.0 | 236 7 30

UM-MAE

Official code for "Uniform Masking: Enabling MAE Pre-training for Pyramid-based Vision Transformers with Locality"

Language: Jupyter Notebook | License: NOASSERTION | 231 5 22

EDT

On Efficient Transformer-Based Image Pre-training for Low-Level Vision

Language: Python | 121 14 11

SportsMOT

[ICCV 2023] SportsMOT: A Large Multi-Object Tracking Dataset in Multiple Sports Scenes

Language: Python | 120 5 14

GroupMixFormer

GroupMixAttention and GroupMixFormer

Language: Python | License: MIT | 107 9 4

AVION

Code release for "Training a Large Video Model on a Single Machine in a Day"

Language: Python | License: MIT | 96 1 10

TeSTra

Code for ECCV2022 "Real-time Online Video Detection with Temporal Smoothing Transformers"

Language: Python | License: Apache-2.0 | 92 2 10

MetaBEV

MetaBEV: Solving Sensor Failures for BEV Detection and Map Segmentation

License: MIT | 83 6 6

sparseformer

(ICLR 2024, CVPR 2024) SparseFormer

Language: Python | License: MIT | 61 9 2

DDM

[CVPR 2022] Progressive Attention on Multi-Level Dense Difference Maps for Generic Event Boundary Detection

Language: Python | License: MIT | 47 2 10

STMixer

[CVPR 2023] STMixer: A One-Stage Sparse Action Detector

Language: Python | 46 1 4

VideoMAE-Action-Detection

[NeurIPS 2022 Spotlight] VideoMAE for Action Detection

Language: Python | License: NOASSERTION | 42 2 5

EVAD

[ICCV 2023] Efficient Video Action Detection with Token Dropout and Context Refinement

Language: Python | License: NOASSERTION | 19 2 4

ZeroI2V

Official implementation of "ZeroI2V: Zero-Cost Adaptation of Pre-trained Transformers from Image to Video"

Language: Python | License: Apache-2.0 | 12 4 4

SNCLR

[ICLR 2023] Soft Neighbors are Positive Supporters in Contrastive Visual Representation Learning

Language: Python | 11 2 2

chatgpt_mini_helper

My customized GPT 3.5 helper

Language: Python | 7 20

VideoMAE

VideoMAE: Masked Autoencoders are Data-Efficient Learners for Self-Supervised Video Pre-Training

Language: Python | License: NOASSERTION | 1 10