SheffieldCao

Tongji Univ

Shanghai, China

xucaotju@gmail.com

Xu CAO's repositories

Co-DETR

[ICCV 2023] DETRs with Collaborative Hybrid Assignments Training

MIT

xformers

Hackable and optimized Transformers building blocks, supporting a composable construction.

NOASSERTION

Far3D

[AAAI 2024] Far3D: Expanding the Horizon for Surround-view 3D Object Detection

NOASSERTION

DiffIR

This project is the official implementation of "DiffIR: Efficient Diffusion Model for Image Restoration", ICCV 2023

UniAD

[CVPR 2023 Best Paper] Planning-oriented Autonomous Driving

Apache-2.0

mmagic

OpenMMLab Image and Video Restoration, Editing and Generation Toolbox

Language: Jupyter Notebook

Apache-2.0

mmsegmentation

OpenMMLab Semantic Segmentation Toolbox and Benchmark.

Language: Python

Apache-2.0

SheffieldCao

Config files for my GitHub profile.

ODISE

Official PyTorch implementation of ODISE: Open-Vocabulary Panoptic Segmentation with Text-to-Image Diffusion Models [CVPR 2023 Highlight]

NOASSERTION

fromage

🧀 Code and models for the ICML 2023 paper "Grounding Language Models to Images for Multimodal Inputs and Outputs".

Apache-2.0

OVO-Open-Vocabulary-Occupancy

Apache-2.0

ViT-Adapter

[ICLR 2023 Spotlight] Vision Transformer Adapter for Dense Predictions

Apache-2.0

ov-seg

This is the official PyTorch implementation of the paper Open-Vocabulary Semantic Segmentation with Mask-adapted CLIP.

NOASSERTION

VLDet

[ICLR 2023] PyTorch implementation of VLDet (https://arxiv.org/abs/2211.14843)

CAT-Seg

Official Implementation of "CAT-Seg🐱: Cost Aggregation for Open-Vocabulary Semantic Segmentation"

Language: Python

VoxFormer

Official PyTorch implementation of VoxFormer [CVPR 2023 Highlight]

NOASSERTION

Lite-Mono

Lite-Mono: A Lightweight CNN and Transformer Architecture for Self-Supervised Monocular Depth Estimation

MIT

Occ3D

MIT

stable-dreamfusion

A PyTorch implementation of text-to-3D DreamFusion, powered by Stable Diffusion.

Apache-2.0

SurroundOcc

Multi-camera 3D Occupancy Prediction for Autonomous Driving

Apache-2.0

Multimodal-GPT

Multimodal-GPT

Apache-2.0

mmpretrain

OpenMMLab Pre-training Toolbox and Benchmark

Apache-2.0

Occ3DBaseline

CVPR2023-Occupancy-Prediction-Challenge

MIT

SAN

Open-vocabulary Semantic Segmentation

MIT

Semantic-Segment-Anything

Automated dense category annotation engine that serves as the initial semantic labeling for the Segment Anything dataset (SA-1B).

Apache-2.0

clip-interrogator

Image to prompt with BLIP and CLIP

MIT

sheffield.github.io

GitHub Pages template for academic personal websites, forked from mmistakes/minimal-mistakes

MIT

PolarFormer

[AAAI 2023] PolarFormer: Multi-camera 3D Object Detection with Polar Transformers

MIT

BEVFormer

[ECCV 2022] This is the official implementation of BEVFormer, a camera-only framework for autonomous driving perception, e.g., 3D object detection and semantic map segmentation.

Apache-2.0

mmdet-learning

Language: Python

Apache-2.0