Zongyang Ma's repositories
VinVL
project page for VinVL
GRIT-VLP
This is an official implementation of GRIT-VLP
CogVideo
Text-to-video generation.
Clover
Official PyTorch implementation of Clover: Towards A Unified Video-Language Alignment and Fusion Model
X-VLM
X-VLM: Multi-Grained Vision Language Pre-Training (ICML 2022)
LAVENDER
A Unified Framework for Video-Language Understanding
UVLP
CVPR 2022 (Oral) PyTorch code for Unsupervised Vision-and-Language Pre-training via Retrieval-based Multi-Granular Alignment
pytorch_violet
A PyTorch implementation of VIOLET
all-in-one
[Arxiv2022] All in One: Exploring Unified Video-Language Pre-training
SLIP
Code release for SLIP: Self-supervision meets Language-Image Pre-training
ml-cvnets
CVNets: A library for training computer vision networks
RETRO-pytorch
Implementation of RETRO, DeepMind's retrieval-based attention net, in PyTorch
BLIP
PyTorch code for BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
ALBEF
Code for ALBEF: a new vision-language pre-training method
mae
PyTorch implementation of MAE https://arxiv.org/abs/2111.06377
SoftTeacher
Semi-Supervised Learning, Object Detection, ICCV2021
MAE-pytorch
Unofficial PyTorch implementation of Masked Autoencoders Are Scalable Vision Learners
frozen-in-time
Frozen in Time: A Joint Video and Image Encoder for End-to-End Retrieval [ICCV'21]
ReIR-WeaklyGrounding.pytorch
The official PyTorch code for "Relation-aware Instance Refinement for Weakly Supervised Visual Grounding" accepted by CVPR2021
TOOD
TOOD: Task-aligned One-stage Object Detection, ICCV2021 Oral
CoOp
Learning to Prompt for Vision-Language Models.
ovr-cnn
A new framework for open-vocabulary object detection, based on maskrcnn-benchmark
OWOD
(CVPR 2021 Oral) Open World Object Detection
simple-faster-rcnn-pytorch
A simplified implementation of Faster R-CNN that replicates the performance reported in the original paper
TransT
Transformer Tracking (CVPR2021)