Akash Mahajan (akashmjn)

Company: @ContextualAI

Location: Redwood City, CA

Home Page: https://akashmjn.me

Akash Mahajan's starred repositories

segment-anything

The repository provides code for running inference with the SegmentAnything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 46871 | Issues: 305 | Issues: 662

PaddleOCR

Awesome multilingual OCR toolkits based on PaddlePaddle (practical ultra lightweight OCR system, support 80+ languages recognition, provide data annotation and synthesis tools, support training and deployment among server, mobile, embedded and IoT devices)

Language: Python | License: Apache-2.0 | Stargazers: 42891 | Issues: 440 | Issues: 9264

FastChat

An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.

Language: Python | License: Apache-2.0 | Stargazers: 36543 | Issues: 348 | Issues: 1771

faiss

A library for efficient similarity search and clustering of dense vectors.

llama3

The official Meta Llama 3 GitHub site

Language: Python | License: NOASSERTION | Stargazers: 26310 | Issues: 216 | Issues: 242

LLaVA

[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.

Language: Python | License: Apache-2.0 | Stargazers: 19487 | Issues: 160 | Issues: 1492

mlx

MLX: An array framework for Apple silicon

redis-py

Redis Python client

Language: Python | License: MIT | Stargazers: 12562 | Issues: 326 | Issues: 1711

surya

OCR, layout analysis, reading order, line detection in 90+ languages

Language: Python | License: GPL-3.0 | Stargazers: 9914 | Issues: 86 | Issues: 131

trl

Train transformer language models with reinforcement learning.

Language: Python | License: Apache-2.0 | Stargazers: 9371 | Issues: 74 | Issues: 1120

minbpe

Minimal, clean code for the Byte Pair Encoding (BPE) algorithm commonly used in LLM tokenization.

Language: Python | License: MIT | Stargazers: 9065 | Issues: 83 | Issues: 36

promptbase

All things prompt engineering

Language: Python | License: MIT | Stargazers: 5363 | Issues: 59 | Issues: 13

sglang

SGLang is a fast serving framework for large language models and vision language models.

Language: Python | License: Apache-2.0 | Stargazers: 5300 | Issues: 53 | Issues: 535

alignment-handbook

Robust recipes to align language models with human and AI preferences

Language: Python | License: Apache-2.0 | Stargazers: 4515 | Issues: 107 | Issues: 133

gpu.cpp

A lightweight library for portable low-level GPU computation using WebGPU.

Language: C++ | License: Apache-2.0 | Stargazers: 3682 | Issues: 45 | Issues: 21

deepdoctection

A Repo For Document AI

Language: Python | License: Apache-2.0 | Stargazers: 2517 | Issues: 18 | Issues: 180

lightllm

LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalability, and high-speed performance.

Language: Python | License: Apache-2.0 | Stargazers: 2313 | Issues: 22 | Issues: 178

webdataset

A high-performance Python-based I/O system for large (and small) deep learning problems, with strong support for PyTorch.

Language: Python | License: BSD-3-Clause | Stargazers: 2223 | Issues: 22 | Issues: 322

whisper_streaming

Whisper realtime streaming for long speech-to-text transcription and translation

Language: Python | License: MIT | Stargazers: 1810 | Issues: 36 | Issues: 94

S-LoRA

S-LoRA: Serving Thousands of Concurrent LoRA Adapters

Language: Python | License: Apache-2.0 | Stargazers: 1713 | Issues: 24 | Issues: 38

HALOs

A library with extensible implementations of DPO, KTO, PPO, ORPO, and other human-aware loss functions (HALOs).

Language: Python | License: Apache-2.0 | Stargazers: 706 | Issues: 7 | Issues: 20

gritlm

Generative Representational Instruction Tuning

Language: Jupyter Notebook | License: MIT | Stargazers: 536 | Issues: 8 | Issues: 47

tevatron

Tevatron - A flexible toolkit for neural retrieval research and development.

Language: Python | License: Apache-2.0 | Stargazers: 490 | Issues: 10 | Issues: 96

unitable

UniTable: Towards a Unified Table Foundation Model

Language: Jupyter Notebook | License: MIT | Stargazers: 346 | Issues: 9 | Issues: 29

Language: Python | License: NOASSERTION | Stargazers: 289 | Issues: 13 | Issues: 0

tamil-llama

A New Tamil Large Language Model (LLM) Based on Llama 2

Language: Python | License: GPL-3.0 | Stargazers: 255 | Issues: 12 | Issues: 11

stopes

A library for preparing data for machine translation research (monolingual preprocessing, bitext mining, etc.) built by the FAIR NLLB team.

Language: Python | License: MIT | Stargazers: 247 | Issues: 20 | Issues: 40

chug

Minimal sharded dataset loaders, decoders, and utils for multi-modal document, image, and text datasets.

Language: Python | License: Apache-2.0 | Stargazers: 151 | Issues: 10 | Issues: 3

vidore-benchmark

Vision Document Retrieval (ViDoRe): Benchmark. Evaluation code for the ColPali paper.

Language: Python | License: MIT | Stargazers: 103 | Issues: 3 | Issues: 1

openhathi_instruct

This repository contains the code for dataset curation and finetuning of instruct variant of the Bilingual OpenHathi model. The resulting model is meant to follow instructions and chat in Hindi and Hinglish.

Language: Jupyter Notebook | Stargazers: 23 | Issues: 4 | Issues: 4