Vishaal Udandarao (vishaal27)

Company: University of Tübingen | University of Cambridge

Location: Tübingen, Germany

Home Page: https://vishaal27.github.io/

Twitter: @vishaal_urao

Vishaal Udandarao's starred repositories

tiktoken

tiktoken is a fast BPE tokeniser for use with OpenAI's models.

Language: Python | License: MIT | Stargazers: 11,585 | Watchers: 167 | Issues: 229
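
For reference, a minimal usage sketch of tiktoken's public API; the encoding name and sample string below are illustrative:

    import tiktoken

    # Load the BPE encoding used by GPT-4 / GPT-3.5-class models
    enc = tiktoken.get_encoding("cl100k_base")

    token_ids = enc.encode("hello world")          # text -> list of integer token ids
    assert enc.decode(token_ids) == "hello world"  # ids decode back to the original text
    print(len(token_ids))                          # token count, useful for context-window budgeting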

dbrx

Code examples and resources for DBRX, a large language model developed by Databricks

Language: Python | License: NOASSERTION | Stargazers: 2,492 | Watchers: 40 | Issues: 23

img2img-turbo

One-step image-to-image with Stable Diffusion turbo: sketch2image, day2night, and more

Language: Python | License: MIT | Stargazers: 1,409 | Watchers: 18 | Issues: 74

mup

Maximal Update Parametrization (µP)

Language: Jupyter Notebook | License: MIT | Stargazers: 1,319 | Watchers: 29 | Issues: 61
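
A rough sketch of how the mup package is typically wired in, assuming the API from the microsoft/mup README; the layer sizes and learning rate here are illustrative:

    import torch.nn as nn
    from mup import MuReadout, set_base_shapes, MuAdam

    class MLP(nn.Module):
        def __init__(self, width):
            super().__init__()
            self.body = nn.Sequential(nn.Linear(32, width), nn.ReLU())
            self.head = MuReadout(width, 10)  # µP-aware drop-in for the output nn.Linear
        def forward(self, x):
            return self.head(self.body(x))

    base = MLP(width=64)     # narrow "base" model fixes the reference shapes
    model = MLP(width=1024)  # the wide model actually being trained
    set_base_shapes(model, base)               # record base-vs-target widths on each param
    opt = MuAdam(model.parameters(), lr=1e-3)  # optimizer applies per-layer µP lr scaling

The point of µP is that hyperparameters tuned on the narrow base model transfer to the wide one.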

evolutionary-model-merge

Official repository of "Evolutionary Optimization of Model Merging Recipes"

Language: Python | License: Apache-2.0 | Stargazers: 1,168 | Watchers: 40 | Issues: 11

lilac

Curate better data for LLMs

Language: Python | License: Apache-2.0 | Stargazers: 924 | Watchers: 13 | Issues: 291

Mind2Web

[NeurIPS'23 Spotlight] "Mind2Web: Towards a Generalist Agent for the Web"

Language: Jupyter Notebook | License: MIT | Stargazers: 644 | Watchers: 22 | Issues: 41

Long-CLIP

[ECCV 2024] official code for "Long-CLIP: Unlocking the Long-Text Capability of CLIP"

Language: Python | License: Apache-2.0 | Stargazers: 559 | Watchers: 12 | Issues: 64

HPT

HPT - Open Multimodal LLMs from HyperGAI

Language: Python | License: Apache-2.0 | Stargazers: 305 | Watchers: 7 | Issues: 11

scaling_on_scales

When do we not need larger vision models?

Language: Python | License: MIT | Stargazers: 299 | Watchers: 7 | Issues: 14

LAMM

[NeurIPS 2023 Datasets and Benchmarks Track] LAMM: Multi-Modal Large Language Models and Applications as AI Agents

ALLaVA

Harnessing 1.4M GPT4V-synthesized Data for A Lite Vision-Language Model

Language: Python | License: Apache-2.0 | Stargazers: 234 | Watchers: 11 | Issues: 11

LLM-SLERP-Merge

SLERP (spherical linear interpolation) merging of PyTorch/HF-format language models with minimal feature loss.
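
Independent of this repository's implementation, the underlying operation is spherical linear interpolation between weight tensors; a minimal sketch, with the tolerance and t value chosen for illustration:

    import torch

    def slerp(w0: torch.Tensor, w1: torch.Tensor, t: float) -> torch.Tensor:
        # Interpolate along the great circle between two weight tensors; unlike
        # plain averaging, this preserves the angular structure of the weights.
        v0, v1 = w0.flatten().double(), w1.flatten().double()
        u0, u1 = v0 / v0.norm(), v1 / v1.norm()
        omega = torch.arccos(torch.clamp(u0 @ u1, -1.0, 1.0))  # angle between the two
        if omega < 1e-6:
            out = (1 - t) * v0 + t * v1  # nearly parallel: fall back to linear interpolation
        else:
            s = torch.sin(omega)
            out = (torch.sin((1 - t) * omega) / s) * v0 + (torch.sin(t * omega) / s) * v1
        return out.reshape(w0.shape).to(w0.dtype)

    # Usage: merge two fine-tunes of the same base model, key by key
    # merged = {k: slerp(sd_a[k], sd_b[k], t=0.5) for k in sd_a}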

ml-tic-clip

Repository for the paper: "TiC-CLIP: Continual Training of CLIP Models".

Language: Python | License: NOASSERTION | Stargazers: 88 | Watchers: 15 | Issues: 0

Visual-CoT

Visual CoT: Advancing Multi-Modal Language Models with a Comprehensive Dataset and Benchmark for Chain-of-Thought Reasoning

Language: Python | License: Apache-2.0 | Stargazers: 84 | Watchers: 1 | Issues: 6

Chain-of-Spot

Chain-of-Spot: Interactive Reasoning Improves Large Vision-language Models

Language: Python | License: Apache-2.0 | Stargazers: 80 | Watchers: 5 | Issues: 7

DreamLIP

[ECCV 2024] Official PyTorch implementation of DreamLIP: Language-Image Pre-training with Long Captions

Language: Python | License: NOASSERTION | Stargazers: 78 | Watchers: 8 | Issues: 8

attention-interpolation-diffusion

Interpolation between text-to-image generations.

Inflection-Benchmarks

Public Inflection Benchmarks

skerch

Sketched matrix decompositions for PyTorch

Language: Python | License: MIT | Stargazers: 63 | Watchers: 2 | Issues: 2

mnms

m&ms: A Benchmark to Evaluate Tool Use for Multi-Step, Multi-Modal Tasks

modelgauge

Make it easy to automatically and uniformly measure the behavior of many AI systems.

Language: Python | License: Apache-2.0 | Stargazers: 25 | Watchers: 17 | Issues: 121

VL-ICL

Code for the paper "VL-ICL Bench: The Devil in the Details of Benchmarking Multimodal In-Context Learning"

Meta-Prompting

Meta-Prompting for Automating Zero-shot Visual Recognition with LLMs (ECCV 2024)

Language: Python | License: MIT | Stargazers: 11 | Watchers: 3 | Issues: 1

visual_diversity_budget

Annotations on a Budget: Leveraging Geo-Data Similarity to Balance Model Performance and Annotation Cost

clipcov-data-efficient-clip

Code for the AISTATS paper on efficient multimodal learning (Eff MML)

Language: Jupyter Notebook | Stargazers: 7 | Watchers: 0 | Issues: 0