HaoCheng's repositories
3D-Box-Segment-Anything
We extend Segment Anything to 3D perception by combining it with VoxelNeXt.
aot-benchmark
An efficient modular implementation of Associating Objects with Transformers for Video Object Segmentation in PyTorch
Collaborative_Perception
This repository is a paper digest of recent advances in collaborative / cooperative / multi-agent perception for V2I / V2V / V2X autonomous driving scenarios.
ConditionalDETR
This repository is an official implementation of the ICCV 2021 paper "Conditional DETR for Fast Training Convergence". (https://arxiv.org/abs/2108.06152)
DAB-DETR
[ICLR 2022] Official implementation of the paper "DAB-DETR: Dynamic Anchor Boxes are Better Queries for DETR"
DeepAccident
Code for the benchmark - DeepAccident: A Motion and Accident Prediction Benchmark for V2X Autonomous Driving.
Deformable-DETR
Deformable DETR: Deformable Transformers for End-to-End Object Detection.
DriveLM
DriveLM: Driving with Graph Visual Question Answering
FastSAM
Fast Segment Anything
futr3d
Code for paper: FUTR3D: a unified sensor fusion framework for 3d detection
Grounded-Segment-Anything
Marrying Grounding DINO with Segment Anything & Stable Diffusion & Tag2Text & BLIP & Whisper & ChatBot - Automatically Detect, Segment and Generate Anything with Image, Text, and Audio Inputs
H-Deformable-DETR
[CVPR 2023] This is an official implementation of the paper "DETRs with Hybrid Matching".
l5kit
L5Kit - https://woven.toyota
LAformer
Official PyTorch Implementation of "LAformer: Trajectory Prediction for Autonomous Driving with Lane-Aware Scene Constraints"
MaskDINO
[CVPR 2023] Official implementation of the paper "Mask DINO: Towards A Unified Transformer-based Framework for Object Detection and Segmentation"
mile
PyTorch code for the paper "Model-Based Imitation Learning for Urban Driving".
MobileSAM
This is the official code for the Faster Segment Anything (MobileSAM) project that makes SAM lightweight.
PF-Track
Implementation of PF-Track
Segment-and-Track-Anything
An open-source project dedicated to tracking and segmenting any objects in videos, either automatically or interactively. The primary algorithms utilized include the Segment Anything Model (SAM) for key-frame segmentation and Associating Objects with Transformers (AOT) for efficient tracking and propagation purposes.
simple_bev
A Simple Baseline for BEV Perception
Sparse4D
Sparse4D v1 & v2
UniAD
[CVPR 2023 Award Candidate] Planning-oriented Autonomous Driving
unilm
Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
Video-Swin-Transformer
This is an official implementation of "Video Swin Transformer".
Vim
Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model
VoxelNeXt
VoxelNeXt: Fully Sparse VoxelNet for 3D Object Detection and Tracking (CVPR 2023)