Wufei Ma's repositories
imagenet3d
ImageNet3D: Towards General-Purpose Object-Level 3D Understanding
imagenet3d_exp
Code to reproduce baseline results on ImageNet3D.
OOD-CV-Data
OOD-CV dataset and data preprocessing tools
3d-annotator
✏️ Web-based image segmentation tool for object detection, localization, and keypoints
aim
Aim 💫 — easy-to-use and performant open-source ML experiment tracker.
CogVideo
Text-to-video generation. The repo for the ICLR 2023 paper "CogVideo: Large-scale Pretraining for Text-to-Video Generation via Transformers"
ConvNeXt
Code release for ConvNeXt model
DCVC
Deep Contextual Video Compression
Grounded-Segment-Anything
Grounded-SAM: Marrying Grounding DINO with Segment Anything & Stable Diffusion & Recognize Anything - Automatically Detect, Segment and Generate Anything
InternVideo
InternVideo: General Video Foundation Models via Generative and Discriminative Learning (https://arxiv.org/abs/2212.03191)
mae
PyTorch implementation of MAE (https://arxiv.org/abs/2111.06377)
omni3d
Code release for "Omni3D: A Large Benchmark and Model for 3D Object Detection in the Wild"
open_clip
An open source implementation of CLIP.
SimVTP
SimVTP: This repo is the official implementation of "Simple Video Text Pre-training with Masked Autoencoders"
videocomposer
Official repo for VideoComposer: Compositional Video Synthesis with Motion Controllability