vishaal27

Vishaal Udandarao's starred repositories

matmulfreellm

Implementation for MatMul-free LM.

Language: Python · Apache-2.0 · 2559 39 20

ml-mobileclip

This repository contains the official implementation of the research paper, "MobileCLIP: Fast Image-Text Models through Multi-Modal Reinforced Training" CVPR 2024

Language: Python · NOASSERTION · 464 150

OmniCorpus

OmniCorpus: A Unified Multimodal Corpus of 10 Billion-Level Images Interleaved with Text

153 8 2

dclm

DataComp for Language Models

Language: HTML · MIT · 14600

Recap-DataComp-1B

This is the official repository of our paper "What If We Recaption Billions of Web Images with LLaMA-3 ?"

98 4 8

MINT-1T

MINT-1T: A one trillion token multimodal interleaved dataset.

86 15 2

OCRDatasets

A collection of OCR-related datasets

72 10

RichHF-18K dataset contains rich human feedback labels we collected for our CVPR'24 paper: https://arxiv.org/pdf/2312.10240, along with the file name of the associated labeled images (no urls or images are included in this dataset).

63 6 7

search-agents

Code for the paper 🌳 Tree Search for Language Model Agents

Language: Python · MIT · 61 1 1

goldfish-loss

Official implementation of Goldfish Loss: Mitigating Memorization in Generative LLMs

Language: Python · Apache-2.0 · 5400

ReNO

ReNO: Enhancing One-step Text-to-Image Models through Reward-based Noise Optimization

Language: Python · MIT · 50 6 2

schedules-and-scaling

Language: Python · MIT · 37 40

dangerous-capability-evaluations

Language: Dockerfile · Apache-2.0 · 33 60

svo_probes

The SVO-Probes Dataset for Verb Understanding

Language: Jupyter Notebook · Apache-2.0 · 28 5 1

Flickr30k-Image-Viewer

Small Flask-based apps to browse the Flickr30k dataset.

Language: Python · Apache-2.0 · 20 40

pointingqa

Code for paper "Point and Ask: Incorporating Pointing into Visual Question Answering"

Language: Python · 18 3 2

videophy

Video Generation, Physical Commonsense, Semantic Adherence, VideoCon-Physics

Language: Python · MIT · 1700

llm_dataset_inference

Official Repository for Dataset Inference for LLMs

Language: Python · MIT · 16 1 1

clip-beyond-tail

Generalization Beyond Data Imbalance: A Controlled Study on CLIP for Transferable Insights

Language: Jupyter Notebook · MIT · 14 20

concadia

Language: Python · NOASSERTION · 14 20

MPS

Language: HTML · 12 1 1

CLoG

✌ CLoG: Benchmarking Continual Learning of Image Generation Models

Language: Python · MIT · 11 20

icai

Inverse Constitutional AI: compressing pairwise preference data into a short constitution of principles.

Language: Python · Apache-2.0 · 7 20

foildataset

Experiments on Foil Dataset

Language: Jupyter Notebook · 7 60

vl_compo

7 20

training-cost-trends

Language: Jupyter Notebook · Apache-2.0 · 500

ilid

Industrial Language-Image Dataset (ILID), a web-crawled dataset containing language-image samples from various web catalogs, representing parts/components from the industrial domain.

Language: Python · Apache-2.0 · 5 20

vl-probing

Language: Python · MIT · 3 2 1

BiVLC

Language: Python · MIT · 2 10

BLA

Benchmark for Basic Language Abilities of Multimodal Pretrained Transformers

Language: Jupyter Notebook · MIT · 2 10