Linjie Li's repositories

HERO

Research code for EMNLP 2020 paper "HERO: Hierarchical Encoder for Video+Language Omni-representation Pre-training"

Language: Python | License: MIT | 231 7 48

VQA_ReGAT

Research Code for ICCV 2019 paper "Relation-aware Graph Attention Network for Visual Question Answering"

Language: Python | License: MIT | 182 6 41

HERO_Video_Feature_Extractor

Video Feature Extraction Code for EMNLP 2020 paper "HERO: Hierarchical Encoder for Video+Language Omni-representation Pre-training"

Language: Python | License: MIT | 107 3 8

VALUE

Video And Language Understanding Evaluation

Language: Python | 2 30

attrEXP

Attractiveness experiments on Amazon MTurk

Language: JavaScript | 020

bottom-up-attention

Bottom-up attention model for image captioning and VQA, based on Faster R-CNN and Visual Genome

Language: Jupyter Notebook | License: MIT | 010

cc

Creative Commons copyright license files

Language: HTML | 000

diffusers

🤗 Diffusers: State-of-the-art diffusion models for image and audio generation in PyTorch

Language: Python | License: Apache-2.0 | 000

merlot-1

MERLOT: Multimodal Neural Script Knowledge Models

Language: Python | License: MIT | 000

MIL-NCE_HowTo100M

PyTorch GPU distributed training code for MIL-NCE HowTo100M

Language: Python | License: Apache-2.0 | 000

pythia

A modular framework for vision & language multimodal research from Facebook AI Research (FAIR)

Language: Python | License: NOASSERTION | 010

seada-vqa

A PyTorch implementation of a data augmentation method for visual question answering

Language: Python | License: MIT | 000

simi_pair

Language: Matlab | 040

SlowFast

PySlowFast: video understanding codebase from FAIR for reproducing state-of-the-art video models.

Language: Python | License: Apache-2.0 | 010

TVRetrieval

PyTorch implementation of XML on TVR dataset - TVR: A Large-Scale Dataset for Video-Subtitle Moment Retrieval

Language: Python | License: MIT | 010

vqa2vln-tutorial.github.io

Language: CSS | License: CC0-1.0 | 000

X-Decoder

[CVPR 2023] Official Implementation of X-Decoder for generalized decoding for pixel, image and language

Language: Python | License: MIT | 000