Seonghwan Kim's starred repositories
RLHF-Reward-Modeling
Recipes to train reward models for RLHF.
prepacking
The source code of our work "Prepacking: A Simple Method for Fast Prefilling and Increased Throughput in Large Language Models"
schedule_free
Schedule-Free Optimization in PyTorch
RingAttention
Transformers with Arbitrarily Large Context
ring-flash-attention
Ring attention implementation with flash attention
Inflection-Benchmarks
Public Inflection Benchmarks
ko-rm-judge
Evaluating language model responses using a reward model
DataDreamer
DataDreamer: Prompt. Generate Synthetic Data. Train & Align Models. 🤖💤
gemma_pytorch
The official PyTorch implementation of Google's Gemma models
large_language_model_training_playbook
An open collection of implementation tips, tricks and resources for training large language models
awesome-llm-powered-agent
Awesome things about LLM-powered agents. Papers / Repos / Blogs / ...
textbook_quality
Generate textbook-quality synthetic LLM pretraining data
RethinkTinyLM
The official implementation of “Rethinking Optimization and Architecture for Tiny Language Models”
BlackMamba
Code repository for Black Mamba
ML-Papers-of-the-Week
🔥Highlighting the top ML papers every week.