Kerem Turgutlu's starred repositories

FastChat

An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.

Language: Python | License: Apache-2.0 | Stargazers: 35505 | Issues: 346 | Issues: 1716

autogen

A programming framework for agentic AI. Discord: https://aka.ms/autogen-dc. Roadmap: https://aka.ms/autogen-roadmap

Language: Jupyter Notebook | License: CC-BY-4.0 | Stargazers: 27916 | Issues: 358 | Issues: 1383

vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

Language: Python | License: Apache-2.0 | Stargazers: 21844 | Issues: 196 | Issues: 3232

marker

Convert PDF to markdown quickly with high accuracy

Language: Python | License: GPL-3.0 | Stargazers: 13590 | Issues: 56 | Issues: 154

llama-recipes

Scripts for fine-tuning Meta Llama3 with composable FSDP & PEFT methods to cover single/multi-node GPUs. Supports default & custom datasets for applications such as summarization and Q&A. Supports a number of candidate inference solutions, such as HF TGI and vLLM, for local or cloud deployment. Includes demo apps that showcase Meta Llama3 for WhatsApp & Messenger.

Language: Jupyter Notebook | License: NOASSERTION | Stargazers: 10304 | Issues: 82 | Issues: 291

ml-engineering

Machine Learning Engineering Open Book

Language: Python | License: CC-BY-SA-4.0 | Stargazers: 10133 | Issues: 103 | Issues: 18

text-generation-inference

Large Language Model Text Generation Inference

Language: Python | License: Apache-2.0 | Stargazers: 8309 | Issues: 97 | Issues: 1169

Language: Jupyter Notebook | License: MIT | Stargazers: 8216 | Issues: 75 | Issues: 30

FastUI

Build better UIs faster.

Language: Python | License: MIT | Stargazers: 7761 | Issues: 62 | Issues: 200

TensorRT-LLM

TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.

Language: C++ | License: Apache-2.0 | Stargazers: 7340 | Issues: 83 | Issues: 1521

instructor

structured outputs for llms

Language: Python | License: MIT | Stargazers: 6441 | Issues: 47 | Issues: 233

openchat

OpenChat: Advancing Open-source Language Models with Imperfect Data

Language: Python | License: Apache-2.0 | Stargazers: 5110 | Issues: 51 | Issues: 185

marvin

✨ Build AI interfaces that spark joy

Language: Python | License: Apache-2.0 | Stargazers: 4957 | Issues: 36 | Issues: 200

trlx

A repo for distributed training of language models with Reinforcement Learning via Human Feedback (RLHF)

Language: Python | License: MIT | Stargazers: 4385 | Issues: 49 | Issues: 284

alignment-handbook

Robust recipes to align language models with human and AI preferences

Language: Python | License: Apache-2.0 | Stargazers: 4155 | Issues: 112 | Issues: 122

helium

Lighter web automation for Python

Language: Python | License: MIT | Stargazers: 4154 | Issues: 77 | Issues: 91

python-pptx

Create Open XML PowerPoint documents in Python

Language: Python | License: MIT | Stargazers: 2251 | Issues: 76 | Issues: 860

voyager

🛰️ An approximate nearest-neighbor search library for Python and Java with a focus on ease of use, simplicity, and deployability.

Language: C++ | License: Apache-2.0 | Stargazers: 1211 | Issues: 13 | Issues: 23

Xwin-LM

Xwin-LM: Powerful, Stable, and Reproducible LLM Alignment

awesome-mojo

A curated list of awesome Mojo 🔥 frameworks, libraries, software and resources

texify

Math OCR model that outputs LaTeX and markdown

Language: Python | License: GPL-3.0 | Stargazers: 588 | Issues: 7 | Issues: 8

hqq

Official implementation of Half-Quadratic Quantization (HQQ)

Language: Python | License: Apache-2.0 | Stargazers: 550 | Issues: 14 | Issues: 72

text2sql-data

A collection of datasets that pair questions with SQL queries.

Language: Python | License: NOASSERTION | Stargazers: 521 | Issues: 18 | Issues: 38

NEFTune

Official repository of NEFTune: Noisy Embeddings Improve Instruction Finetuning

Language: Python | License: MIT | Stargazers: 345 | Issues: 11 | Issues: 14

shoggoth

Shoggoth is a peer-to-peer network for publishing and distributing open-source Artificial Intelligence

Language: Rust | License: MIT | Stargazers: 227 | Issues: 2 | Issues: 19

LightSeq

Official repository for LightSeq: Sequence Level Parallelism for Distributed Training of Long Context Transformers

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 158 | Issues: 7 | Issues: 0

multipack_sampler

Multipack distributed sampler for fast padding-free training of LLMs

Language: Python | License: MIT | Stargazers: 157 | Issues: 3 | Issues: 3

optimi

Fast, Modern, Memory Efficient, and Low Precision PyTorch Optimizers

Language: Python | License: MIT | Stargazers: 33 | Issues: 2 | Issues: 1