Kerem Turgutlu's starred repositories

FastChat

An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.

Language: Python | License: Apache-2.0 | Stargazers: 35761 | Issues: 347 | Issues: 1726

autogen

A programming framework for agentic AI. Discord: https://aka.ms/autogen-dc. Roadmap: https://aka.ms/autogen-roadmap

Language: Jupyter Notebook | License: CC-BY-4.0 | Stargazers: 28497 | Issues: 361 | Issues: 1447

vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

Language: Python | License: Apache-2.0 | Stargazers: 23064 | Issues: 211 | Issues: 3472

marker

Convert PDF to markdown quickly with high accuracy

Language: Python | License: GPL-3.0 | Stargazers: 14428 | Issues: 62 | Issues: 170

llama-recipes

Scripts for fine-tuning Meta Llama3 with composable FSDP & PEFT methods to cover single/multi-node GPUs. Supports default & custom datasets for applications such as summarization and Q&A. Supports a number of candidate inference solutions, such as HF TGI and vLLM, for local or cloud deployment. Includes demo apps that showcase Meta Llama3 for WhatsApp & Messenger.

Language: Jupyter Notebook | License: NOASSERTION | Stargazers: 10494 | Issues: 85 | Issues: 295

ml-engineering

Machine Learning Engineering Open Book

Language: Python | License: CC-BY-SA-4.0 | Stargazers: 10234 | Issues: 104 | Issues: 18

text-generation-inference

Large Language Model Text Generation Inference

Language: Python | License: Apache-2.0 | Stargazers: 8404 | Issues: 99 | Issues: 1207

Language: Jupyter Notebook | License: MIT | Stargazers: 8235 | Issues: 74 | Issues: 30

FastUI

Build better UIs faster.

Language: Python | License: MIT | Stargazers: 7858 | Issues: 62 | Issues: 200

TensorRT-LLM

TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.

Language: C++ | License: Apache-2.0 | Stargazers: 7497 | Issues: 87 | Issues: 1590

instructor

Structured outputs for LLMs

Language: Python | License: MIT | Stargazers: 6706 | Issues: 50 | Issues: 248

openchat

OpenChat: Advancing Open-source Language Models with Imperfect Data

Language: Python | License: Apache-2.0 | Stargazers: 5143 | Issues: 51 | Issues: 185

marvin

✨ Build AI interfaces that spark joy

Language: Python | License: Apache-2.0 | Stargazers: 5012 | Issues: 36 | Issues: 200

trlx

A repo for distributed training of language models with Reinforcement Learning via Human Feedback (RLHF)

Language: Python | License: MIT | Stargazers: 4396 | Issues: 49 | Issues: 285

alignment-handbook

Robust recipes to align language models with human and AI preferences

Language: Python | License: Apache-2.0 | Stargazers: 4231 | Issues: 112 | Issues: 124

helium

Lighter web automation with Python

Language: Python | License: MIT | Stargazers: 4164 | Issues: 77 | Issues: 91

python-pptx

Create Open XML PowerPoint documents in Python

Language: Python | License: MIT | Stargazers: 2279 | Issues: 76 | Issues: 861

voyager

🛰️ An approximate nearest-neighbor search library for Python and Java with a focus on ease of use, simplicity, and deployability.

Language: C++ | License: Apache-2.0 | Stargazers: 1227 | Issues: 14 | Issues: 24

Xwin-LM

Xwin-LM: Powerful, Stable, and Reproducible LLM Alignment

awesome-mojo

A curated list of awesome Mojo 🔥 frameworks, libraries, software and resources

texify

Math OCR model that outputs LaTeX and markdown

Language: Python | License: GPL-3.0 | Stargazers: 631 | Issues: 7 | Issues: 8

hqq

Official implementation of Half-Quadratic Quantization (HQQ)

Language: Python | License: Apache-2.0 | Stargazers: 569 | Issues: 14 | Issues: 72

text2sql-data

A collection of datasets that pair questions with SQL queries.

Language: Python | License: NOASSERTION | Stargazers: 524 | Issues: 18 | Issues: 38

NEFTune

Official repository of NEFTune: Noisy Embeddings Improve Instruction Finetuning

Language: Python | License: MIT | Stargazers: 348 | Issues: 11 | Issues: 14

shoggoth

Shoggoth is a peer-to-peer network for publishing and distributing open-source Artificial Intelligence

Language: Rust | License: MIT | Stargazers: 228 | Issues: 2 | Issues: 19

LightSeq

Official repository for LightSeq: Sequence Level Parallelism for Distributed Training of Long Context Transformers

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 159 | Issues: 7 | Issues: 0

multipack_sampler

Multipack distributed sampler for fast padding-free training of LLMs

Language: Python | License: MIT | Stargazers: 159 | Issues: 3 | Issues: 3

optimi

Fast, Modern, Memory Efficient, and Low Precision PyTorch Optimizers

Language: Python | License: MIT | Stargazers: 40 | Issues: 2 | Issues: 1