
Zongyang Ma's repositories

VinVL

Project page for VinVL


HierKD

Language: Python, Apache-2.0

GRIT-VLP

This is an official implementation of GRIT-VLP


CogVideo

Text-to-video generation.

Apache-2.0

Clover

Official PyTorch implementation of Clover: Towards A Unified Video-Language Alignment and Fusion Model


X-VLM

X-VLM: Multi-Grained Vision Language Pre-Training (ICML 2022)

BSD-3-Clause

LAVENDER

A Unified Framework for Video-Language Understanding

MIT

UVLP

CVPR 2022 (Oral) PyTorch code for Unsupervised Vision-and-Language Pre-training via Retrieval-based Multi-Granular Alignment

NOASSERTION

pytorch_violet

A PyTorch implementation of VIOLET


all-in-one

[Arxiv2022] All in One: Exploring Unified Video-Language Pre-training


SLIP

Code release for SLIP: Self-supervision meets Language-Image Pre-training

MIT

ml-cvnets

CVNets: A library for training computer vision networks

NOASSERTION

RETRO-pytorch

Implementation of RETRO, DeepMind's retrieval-based attention net, in PyTorch

Apache-2.0

BLIP

PyTorch code for BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation

BSD-3-Clause

ALBEF

Code for ALBEF: a new vision-language pre-training method

BSD-3-Clause

mae

PyTorch implementation of MAE https://arxiv.org/abs/2111.06377

NOASSERTION

DenseCLIP

DenseCLIP: Language-Guided Dense Prediction with Context-Aware Prompting


mvits_for_class_agnostic_od

Apache-2.0

SoftTeacher

Semi-Supervised Learning, Object Detection, ICCV2021

MIT

MAE-pytorch

Unofficial PyTorch implementation of Masked Autoencoders Are Scalable Vision Learners


frozen-in-time

Frozen in Time: A Joint Video and Image Encoder for End-to-End Retrieval [ICCV'21]


ReIR-WeaklyGrounding.pytorch

The official PyTorch code for "Relation-aware Instance Refinement for Weakly Supervised Visual Grounding", accepted at CVPR 2021


TOOD

TOOD: Task-aligned One-stage Object Detection, ICCV2021 Oral

Apache-2.0

CoOp

Learning to Prompt for Vision-Language Models.

MIT

ovr-cnn

A new framework for open-vocabulary object detection, based on maskrcnn-benchmark

MIT

OWOD

(CVPR 2021 Oral) Open World Object Detection

Apache-2.0

Zero-Shot-Detection-via-Vision-and-Language-Knowledge-Distillation


ReduNet


simple-faster-rcnn-pytorch

A simplified implementation of Faster R-CNN that replicates the performance reported in the original paper

NOASSERTION

TransT

Transformer Tracking (CVPR2021)

GPL-3.0