Rohit Gupta's repositories
Video2Language
Generating video descriptions using deep learning in Keras
diffusion-gen
Use stable diffusion to generate images guided by text
hardlabel-blackbox-attacks
Papers on black box attacks on hard label models
MMContrast
Project webpage for the Multi-Label Multi-Modal Contrastive Learning paper at CVPR 2023
BAR
The repository for the official Biased Action Recognition (BAR) dataset for the paper Learning from Failure: Training Debiased Classifier from Biased Classifier (NeurIPS 2020) by Junhyun Nam et al.
CLAP
Learning audio concepts from natural language supervision
deep-person-reid
Torchreid: Deep learning person re-identification in PyTorch.
fsdp_qlora
Training LLMs with QLoRA + FSDP
GLIP
Grounded Language-Image Pre-training
GULP
This repository contains the code to replicate the results in the paper: "GULP: a prediction-based metric between representations".
image-crop-analysis
Code for reproducing the analysis in our paper "Image Cropping on Twitter: Fairness Metrics, their Limitations, and the Importance of Representation, Design, and Agency"
minREV
A simple minimal implementation of Reversible Vision Transformers
ml-veclip
The official repo for the paper "VeCLIP: Improving CLIP Training via Visual-enriched Captions"
nlpaug
Data augmentation for NLP
process-yt8m
Scripts to process the YouTube-8M (yt8m) dataset
pytorch-gram-schmidt
PyTorch implementation of Gram-Schmidt orthogonalization.
pytorch3d
PyTorch3D is FAIR's library of reusable components for deep learning with 3D data
rohit-gupta.github.io
GitHub Pages repo
simclr-converter
A PyTorch converter for SimCLR checkpoints
solo-learn
solo-learn: a library of self-supervised methods for visual representation learning powered by PyTorch Lightning
SupContrast
PyTorch implementation of "Supervised Contrastive Learning" (and SimCLR incidentally)
TokenReduction
Official PyTorch implementation of "Which Tokens to Use? Investigating Token Reduction in Vision Transformers", presented at the ICCV 2023 NIVT workshop
video-captioning-pretrained
Extracting captions from videos using pre-trained BLIP2-like models
wsolevaluation
Evaluating Weakly Supervised Object Localization Methods Right (CVPR 2020)
youtube8m-data
Extracted YouTube-8M URLs and labels, without all the TFRecord parsing/features