Linjie Li's repositories
HERO_Video_Feature_Extractor
Video Feature Extraction Code for EMNLP 2020 paper "HERO: Hierarchical Encoder for Video+Language Omni-representation Pre-training"
bottom-up-attention
Bottom-up attention model for image captioning and VQA, based on Faster R-CNN and Visual Genome
cc
Creative Commons copyright license files
diffusers
🤗 Diffusers: State-of-the-art diffusion models for image and audio generation in PyTorch
merlot-1
MERLOT: Multimodal Neural Script Knowledge Models
MIL-NCE_HowTo100M
PyTorch GPU distributed training code for MIL-NCE HowTo100M
seada-vqa
A PyTorch implementation of a data augmentation method for visual question answering
TVRetrieval
PyTorch implementation of XML on TVR dataset - TVR: A Large-Scale Dataset for Video-Subtitle Moment Retrieval
X-Decoder
[CVPR 2023] Official implementation of X-Decoder: generalized decoding for pixel, image, and language