Xu CAO's repositories
Co-DETR
[ICCV 2023] DETRs with Collaborative Hybrid Assignments Training
xformers
Hackable and optimized Transformers building blocks, supporting a composable construction.
Far3D
[AAAI 2024] Far3D: Expanding the Horizon for Surround-view 3D Object Detection
DiffIR
This project is the official implementation of 'DiffIR: Efficient Diffusion Model for Image Restoration', ICCV 2023
UniAD
[CVPR 2023 Best Paper] Planning-oriented Autonomous Driving
mmagic
OpenMMLab Image and Video Restoration, Editing and Generation Toolbox
mmsegmentation
OpenMMLab Semantic Segmentation Toolbox and Benchmark.
SheffieldCao
Config files for my GitHub profile.
ODISE
Official PyTorch implementation of ODISE: Open-Vocabulary Panoptic Segmentation with Text-to-Image Diffusion Models [CVPR 2023 Highlight]
fromage
🧀 Code and models for the ICML 2023 paper "Grounding Language Models to Images for Multimodal Inputs and Outputs".
ViT-Adapter
[ICLR 2023 Spotlight] Vision Transformer Adapter for Dense Predictions
ov-seg
This is the official PyTorch implementation of the paper Open-Vocabulary Semantic Segmentation with Mask-adapted CLIP.
VLDet
[ICLR 2023] PyTorch implementation of VLDet (https://arxiv.org/abs/2211.14843)
CAT-Seg
Official Implementation of "CAT-Seg🐱: Cost Aggregation for Open-Vocabulary Semantic Segmentation"
VoxFormer
Official PyTorch implementation of VoxFormer [CVPR 2023 Highlight]
Lite-Mono
Lite-Mono: A Lightweight CNN and Transformer Architecture for Self-Supervised Monocular Depth Estimation
stable-dreamfusion
A PyTorch implementation of text-to-3D DreamFusion, powered by Stable Diffusion.
SurroundOcc
Multi-camera 3D Occupancy Prediction for Autonomous Driving
Multimodal-GPT
Multimodal-GPT
mmpretrain
OpenMMLab Pre-training Toolbox and Benchmark
Occ3DBaseline
CVPR2023-Occupancy-Prediction-Challenge
SAN
Open-vocabulary Semantic Segmentation
Semantic-Segment-Anything
Automated dense category annotation engine that serves as the initial semantic labeling for the Segment Anything dataset (SA-1B).
clip-interrogator
Image to prompt with BLIP and CLIP
sheffield.github.io
GitHub Pages template for academic personal websites, forked from mmistakes/minimal-mistakes
PolarFormer
[AAAI 2023] PolarFormer: Multi-camera 3D Object Detection with Polar Transformers
BEVFormer
[ECCV 2022] This is the official implementation of BEVFormer, a camera-only framework for autonomous driving perception, e.g., 3D object detection and semantic map segmentation.