wavy-jung

Doohae Jung's starred repositories

awesome-mixture-of-experts

A collection of AWESOME things about mixture-of-experts

JetMoE

Reaching LLaMA2 Performance with 0.1M Dollars

Language: Python | License: Apache-2.0 | 95500

GLMKD

Are Intermediate Layers and Labels Really Necessary? A General Language Model Distillation Method; GKD: A General Knowledge Distillation Framework for Large-scale Pre-trained Language Model

Language: Python | License: MIT | 3100

tree-sitter

An incremental parsing system for programming tools

Language: Rust | License: MIT | 1785500

devdocs

API Documentation Browser

Language: Ruby | License: MPL-2.0 | 3485700

axlearn

An Extensible Deep Learning Library

Language: Python | License: Apache-2.0 | 175000

Yuan2.0-M32

Mixture-of-Experts (MoE) Language Model

Language: Python | License: Apache-2.0 | 17600

learning_ray

Notebooks for the O'Reilly book "Learning Ray"

Language: Jupyter Notebook | License: MIT | 24100

llama-agentic-system

Agentic components of the Llama Stack APIs

Language: Python | License: NOASSERTION | 309000

open_lm

A repository for research on medium sized language models.

Language: Python | License: MIT | 46300

faker

Faker is a Python package that generates fake data for you.

Language: Python | License: MIT | 1752600

LLM101n

LLM101n: Let's build a Storyteller

2753700

NeMo-Aligner

Scalable toolkit for efficient model alignment

Language: Python | License: Apache-2.0 | 49800

datacomp

DataComp: In search of the next generation of multimodal datasets

Language: Python | License: NOASSERTION | 63100

dclm

DataComp for Language Models

Language: HTML | License: MIT | 108400

NeMo-Framework-Launcher

Provides end-to-end model development pipelines for LLMs and Multimodal models that can be launched on-prem or cloud-native.

Language: Python | License: Apache-2.0 | 44100

Skywork-MoE

Skywork-MoE: A Deep Dive into Training Techniques for Mixture-of-Experts Language Models

12000

text-clustering

Easily embed, cluster and semantically label text datasets

Language: Python | License: Apache-2.0 | 42100

maxtext

A simple, performant and scalable Jax LLM!

Language: Python | License: Apache-2.0 | 142000

unsloth

Finetune Llama 3.1, Mistral, Phi & Gemma LLMs 2-5x faster with 80% less memory

Language: Python | License: Apache-2.0 | 1486200

SimPO

SimPO: Simple Preference Optimization with a Reference-Free Reward

Language: Python | License: MIT | 61500

dbrx

Code examples and resources for DBRX, a large language model developed by Databricks

Language: Python | License: NOASSERTION | 249500

mistral-inference

Official inference library for Mistral models

Language: Jupyter Notebook | License: Apache-2.0 | 945800

llama-moe

⛷️ LLaMA-MoE: Building Mixture-of-Experts from LLaMA with Continual Pre-training

Language: Python | License: Apache-2.0 | 83200

llm-swarm

Manage scalable open LLM inference endpoints in Slurm clusters

Language: Python | License: MIT | 21600

Bend

A massively parallel, high-level programming language

Language: Rust | License: Apache-2.0 | 1712500

Scripts for fine-tuning Meta Llama3 with composable FSDP & PEFT methods to cover single/multi-node GPUs. Supports default & custom datasets for applications such as summarization and Q&A. Supports a number of candidate inference solutions, such as HF TGI and vLLM, for local or cloud deployment. Includes demo apps that showcase Meta Llama3 for WhatsApp & Messenger.

Language: Jupyter Notebook | 1144300

distilabel

Distilabel is a framework for synthetic data and AI feedback for engineers who need fast, reliable and scalable pipelines based on verified research papers.

Language: Python | License: Apache-2.0 | 131100