vishaal27

Vishaal Udandarao's starred repositories

llama3

The official Meta Llama 3 GitHub site

Language: Python | License: NOASSERTION | 25706 211 229

Scripts for fine-tuning Meta Llama3 with composable FSDP & PEFT methods to cover single/multi-node GPUs. Supports default & custom datasets for applications such as summarization and Q&A. Supports a number of candidate inference solutions such as HF TGI and VLLM for local or cloud deployment. Includes demo apps to showcase Meta Llama3 for WhatsApp & Messenger.

Language: Jupyter Notebook | License: NOASSERTION | 11404 91 315

lm-evaluation-harness

A framework for few-shot evaluation of language models.

Language: Python | License: MIT | 6217 35 1020

GroundingDINO

[ECCV 2024] Official implementation of the paper "Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection"

Language: Python | License: Apache-2.0 | 6074 37 292

mteb

MTEB: Massive Text Embedding Benchmark

Language: Jupyter Notebook | License: Apache-2.0 | 1747 11 378

llm2vec

Code for 'LLM2Vec: Large Language Models Are Secretly Powerful Text Encoders'

Language: Python | License: MIT | 1076 18 105

LLaVA-pp

🔥🔥 LLaVA++: Extending LLaVA with Phi-3 and LLaMA-3 (LLaVA LLaMA-3, LLaVA Phi-3)

Language: Python | 783 10 32

arena-hard-auto

Arena-Hard-Auto: An automatic LLM benchmark.

Language: Jupyter Notebook | License: Apache-2.0 | 381 5 24

better_profanity

Blazingly fast cleaning of swear words (and their leetspeak) in strings

Language: Python | License: MIT | 202 6 34

onetwo

Language: Python | License: Apache-2.0 | 158 14 4

gogetcrawl

Extract web archive data using Wayback Machine and Common Crawl

Language: Go | License: MIT | 141 5 1

ID-Aligner

Official implementation of ID-Aligner

115 18 1

SemDeDup

Code for "SemDeDup", a simple method for identifying and removing semantic duplicates from a dataset (data pairs which are semantically similar, but not exactly identical).

Language: Python | License: NOASSERTION | 97 3 9