Rohit Gupta's repositories
GPTFast
Accelerate your Hugging Face Transformers by 7.6-9x. Native to Hugging Face and PyTorch.
rohit-gupta.github.io
GitHub Pages repo
htstep
HT-Step is a large-scale article grounding dataset of temporal step annotations on how-to videos
deep-person-reid
Torchreid: Deep learning person re-identification in PyTorch.
solo-learn
solo-learn: a library of self-supervised methods for visual representation learning powered by PyTorch Lightning
ml-veclip
The official repo for the paper "VeCLIP: Improving CLIP Training via Visual-enriched Captions"
fsdp_qlora
Training LLMs with QLoRA + FSDP
VidChapters
[NeurIPS 2023 D&B] VidChapters-7M: Video Chapters at Scale
GLIP
Grounded Language-Image Pre-training
GULP
This repository contains the code to replicate the results in the paper: "GULP: a prediction-based metric between representations".
video-captioning-pretrained
Extracting captions from videos using pre-trained BLIP2-like models
TokenReduction
Official PyTorch implementation of "Which Tokens to Use? Investigating Token Reduction in Vision Transformers", presented at the ICCV 2023 NIVT workshop
MMContrast
Project webpage for the Multi-Label Multi-Modal Contrastive Learning paper at CVPR 2023
process-yt8m
Scripts to process the yt8m (YouTube-8M) dataset
CLAP
Learning audio concepts from natural language supervision
minREV
A simple minimal implementation of Reversible Vision Transformers
nlpaug
Data augmentation for NLP
diffusion-gen
Use stable diffusion to generate images guided by text
pytorch-gram-schmidt
PyTorch implementation of Gram-Schmidt orthogonalization.
pytorch3d
PyTorch3D is FAIR's library of reusable components for deep learning with 3D data
SupContrast
PyTorch implementation of "Supervised Contrastive Learning" (and SimCLR incidentally)
image-crop-analysis
Code for reproducing the analysis in our paper "Image Cropping on Twitter: Fairness Metrics, their Limitations, and the Importance of Representation, Design, and Agency"
simclr-converter
A PyTorch converter for SimCLR checkpoints
youtube8m-data
Extracted YouTube 8M URLs and Labels without all the TF Record parsing/features
BAR
The official repository for the Biased Action Recognition (BAR) dataset from the paper "Learning from Failure: Training Debiased Classifier from Biased Classifier" (NeurIPS 2020) by Junhyun Nam et al.