Yongming Rao's starred repositories

instant-ngp

Instant neural graphics primitives: lightning fast NeRF and more

Language: Cuda | License: NOASSERTION | Stargazers: 15697 | Issues: 205 | Issues: 1010

open_clip

An open source implementation of CLIP.

Language: Python | License: NOASSERTION | Stargazers: 9468 | Issues: 77 | Issues: 458

mae

PyTorch implementation of MAE: https://arxiv.org/abs/2111.06377

Language: Python | License: NOASSERTION | Stargazers: 7053 | Issues: 58 | Issues: 187

dino

PyTorch code for Vision Transformers training with the Self-Supervised learning method DINO

Language: Python | License: Apache-2.0 | Stargazers: 6137 | Issues: 68 | Issues: 246

ConvNeXt

Code release for ConvNeXt model

Language: Python | License: MIT | Stargazers: 5673 | Issues: 33 | Issues: 128

glide-text2im

GLIDE: a diffusion-based text-conditional image synthesis model

Language: Python | License: MIT | Stargazers: 3514 | Issues: 165 | Issues: 44

BEVFormer

[ECCV 2022] This is the official implementation of BEVFormer, a camera-only framework for autonomous driving perception, e.g., 3D object detection and semantic map segmentation.

Language: Python | License: Apache-2.0 | Stargazers: 3140 | Issues: 69 | Issues: 260

kubric

A data generation pipeline for creating semi-realistic synthetic multi-object videos with rich annotations such as instance segmentation masks, depth maps, and optical flow.

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 2249 | Issues: 42 | Issues: 186

Detic

Code release for "Detecting Twenty-thousand Classes using Image-level Supervision".

Language: Python | License: Apache-2.0 | Stargazers: 1828 | Issues: 22 | Issues: 102

pix2seq

Pix2Seq codebase: multi-tasks with generative modeling (autoregressive and diffusion)

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 847 | Issues: 18 | Issues: 48
Language: Python | License: MIT | Stargazers: 753 | Issues: 20 | Issues: 70

SLIP

Code release for SLIP: Self-supervision meets Language-Image Pre-training

Language: Python | License: MIT | Stargazers: 735 | Issues: 18 | Issues: 27

GroupViT

Official PyTorch implementation of GroupViT: Semantic Segmentation Emerges from Text Supervision, CVPR 2022.

Language: Python | License: NOASSERTION | Stargazers: 713 | Issues: 11 | Issues: 64

xcit

Official code for Cross-Covariance Image Transformer (XCiT)

Language: Python | License: Apache-2.0 | Stargazers: 653 | Issues: 18 | Issues: 29

DeCLIP

Supervision Exists Everywhere: A Data Efficient Contrastive Language-Image Pre-training Paradigm

ARKitScenes

This repo accompanies the research paper, ARKitScenes - A Diverse Real-World Dataset for 3D Indoor Scene Understanding Using Mobile RGB-D Data and contains the data, scripts to visualize and process assets, and training code described in our paper.

Language: Python | License: NOASSERTION | Stargazers: 621 | Issues: 24 | Issues: 61
Language: Python | License: NOASSERTION | Stargazers: 534 | Issues: 29 | Issues: 29

CenterFusion

CenterFusion: Center-based Radar and Camera Fusion for 3D Object Detection

Language: Python | License: MIT | Stargazers: 525 | Issues: 10 | Issues: 80

DenseCLIP

[CVPR 2022] DenseCLIP: Language-Guided Dense Prediction with Context-Aware Prompting

Point-BERT

[CVPR 2022] Pre-Training 3D Point Cloud Transformers with Masked Point Modeling

Language: Python | License: MIT | Stargazers: 495 | Issues: 11 | Issues: 68

mvit

Code Release for MViTv2 on Image Recognition.

Language: Python | License: Apache-2.0 | Stargazers: 376 | Issues: 14 | Issues: 20

Stratified-Transformer

Stratified Transformer for 3D Point Cloud Segmentation (CVPR 2022)

Language: Python | License: MIT | Stargazers: 355 | Issues: 6 | Issues: 98

ContrastiveSceneContexts

Code for CVPR 2021 oral paper "Exploring Data-Efficient 3D Scene Understanding with Contrastive Scene Contexts"

Language: Python | License: MIT | Stargazers: 219 | Issues: 16 | Issues: 37

DemystifyLocalViT

Official code for paper "On the Connection between Local Attention and Dynamic Depth-wise Convolution" ICLR 2022 Spotlight

Language: Jupyter Notebook | License: MIT | Stargazers: 181 | Issues: 4 | Issues: 3

SWAG

Official repository for "Revisiting Weakly Supervised Pre-Training of Visual Perception Models". https://arxiv.org/abs/2201.08371.

Language: Jupyter Notebook | License: NOASSERTION | Stargazers: 170 | Issues: 10 | Issues: 10

CAL

[ICCV 2021] Counterfactual Attention Learning for Fine-Grained Visual Categorization and Re-identification

Language: Python | License: MIT | Stargazers: 145 | Issues: 5 | Issues: 24

LiDAR-Distillation

[ECCV 2022] LiDAR Distillation: Bridging the Beam-Induced Domain Gap for 3D Object Detection

Language: Python | License: Apache-2.0 | Stargazers: 104 | Issues: 5 | Issues: 10

FineDiving

FineDiving: A Fine-grained Dataset for Procedure-aware Action Quality Assessment

Language: Python | License: MIT | Stargazers: 102 | Issues: 3 | Issues: 12