raoyongming

Tencent

https://raoyongming.github.io/

Yongming Rao's starred repositories

instant-ngp

Instant neural graphics primitives: lightning fast NeRF and more

Language: Cuda | License: NOASSERTION | 15697 205 1010

CVPR2023-Papers-with-Code

A collection of CVPR 2023 papers and open-source projects

open_clip

An open source implementation of CLIP.

Language: Python | License: NOASSERTION | 9468 77 458

mae

PyTorch implementation of MAE https://arxiv.org/abs/2111.06377

Language: Python | License: NOASSERTION | 7053 58 187

dino

PyTorch code for Vision Transformers training with the Self-Supervised learning method DINO

Language: Python | License: Apache-2.0 | 6137 68 246

guided-diffusion

Language: Python | License: MIT | 5963 142 138

ConvNeXt

Code release for ConvNeXt model

Language: Python | License: MIT | 5673 33 128

glide-text2im

GLIDE: a diffusion-based text-conditional image synthesis model

Language: Python | License: MIT | 3514 165 44

BEVFormer

[ECCV 2022] This is the official implementation of BEVFormer, a camera-only framework for autonomous driving perception, e.g., 3D object detection and semantic map segmentation.

Language: Python | License: Apache-2.0 | 3140 69 260

kubric

A data generation pipeline for creating semi-realistic synthetic multi-object videos with rich annotations such as instance segmentation masks, depth maps, and optical flow.

Language: Jupyter Notebook | License: Apache-2.0 | 2249 42 186

Detic

Code release for "Detecting Twenty-thousand Classes using Image-level Supervision".

Language: Python | License: Apache-2.0 | 1828 22 102

pix2seq

Pix2Seq codebase: multi-tasks with generative modeling (autoregressive and diffusion)

Language: Jupyter Notebook | License: Apache-2.0 | 847 18 48

detr3d

Language: Python | License: MIT | 753 20 70

SLIP

Code release for SLIP: Self-supervision meets Language-Image Pre-training

Language: Python | License: MIT | 735 18 27

GroupViT

Official PyTorch implementation of GroupViT: Semantic Segmentation Emerges from Text Supervision, CVPR 2022.

Language: Python | License: NOASSERTION | 713 11 64

xcit

Official code for Cross-Covariance Image Transformer (XCiT)

Language: Python | License: Apache-2.0 | 653 18 29

DeCLIP

Supervision Exists Everywhere: A Data Efficient Contrastive Language-Image Pre-training Paradigm

Language: Python | 622 19 29

ARKitScenes

This repo accompanies the research paper, ARKitScenes - A Diverse Real-World Dataset for 3D Indoor Scene Understanding Using Mobile RGB-D Data and contains the data, scripts to visualize and process assets, and training code described in our paper.

Language: Python | License: NOASSERTION | 621 24 61

StyleSDF

Language: Python | License: NOASSERTION | 534 29 29

CenterFusion

CenterFusion: Center-based Radar and Camera Fusion for 3D Object Detection

Language: Python | License: MIT | 525 10 80

DenseCLIP

[CVPR 2022] DenseCLIP: Language-Guided Dense Prediction with Context-Aware Prompting

Language: Python | 505 3 53

Point-BERT

[CVPR 2022] Pre-Training 3D Point Cloud Transformers with Masked Point Modeling

Language: Python | License: MIT | 495 11 68

mvit

Code Release for MViTv2 on Image Recognition.

Language: Python | License: Apache-2.0 | 376 14 20

Stratified-Transformer

Stratified Transformer for 3D Point Cloud Segmentation (CVPR 2022)

Language: Python | License: MIT | 355 6 98

ContrastiveSceneContexts

Code for CVPR 2021 oral paper "Exploring Data-Efficient 3D Scene Understanding with Contrastive Scene Contexts"

Language: Python | License: MIT | 219 16 37

DemystifyLocalViT

Official code for paper "On the Connection between Local Attention and Dynamic Depth-wise Convolution" ICLR 2022 Spotlight

Language: Jupyter Notebook | License: MIT | 181 4 3

SWAG

Official repository for "Revisiting Weakly Supervised Pre-Training of Visual Perception Models". https://arxiv.org/abs/2201.08371.

Language: Jupyter Notebook | License: NOASSERTION | 170 10 10

CAL

[ICCV 2021] Counterfactual Attention Learning for Fine-Grained Visual Categorization and Re-identification

Language: Python | License: MIT | 145 5 24

LiDAR-Distillation

[ECCV 2022] LiDAR Distillation: Bridging the Beam-Induced Domain Gap for 3D Object Detection

Language: Python | License: Apache-2.0 | 104 5 10

FineDiving

FineDiving: A Fine-grained Dataset for Procedure-aware Action Quality Assessment

Language: Python | License: MIT | 102 3 12