ASU Thesis Format
The low-level, core functionality of Boto3.
Contrastive Multiview Coding
Reference code for the paper CAMS: Color-Aware Multi-Style Transfer.
Extracting optical flow and frames
Flickr30K Entities Dataset
Code for our CVPR 2020 paper "IMRAM: Iterative Matching with Recurrent Attention Memory for Cross-Modal Image-Text Retrieval"
Learning phrase grounding from captioned images through InfoNCE bound on mutual information
Python notebooks with ML and deep learning examples with Azure Machine Learning | Microsoft
Markdown content for the www.aerobatic.io website
Oscar and VinVL
A deep learning library for video understanding research.
Research code for CVPR 2022 paper "SwinBERT: End-to-End Transformers with Sparse Attention for Video Captioning"
Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
Video Swin Transformer - PyTorch