Doohae Jung (wavy-jung)

Company: @kakao

Location: Seoul, Korea

Doohae Jung's starred repositories

awesome-mixture-of-experts

A collection of AWESOME things about mixture-of-experts

Stargazers: 906, Issues: 0

JetMoE

Reaching LLaMA2 Performance with 0.1M Dollars

Language: Python, License: Apache-2.0, Stargazers: 955, Issues: 0

GLMKD

Are Intermediate Layers and Labels Really Necessary? A General Language Model Distillation Method; GKD: A General Knowledge Distillation Framework for Large-scale Pre-trained Language Model

Language: Python, License: MIT, Stargazers: 31, Issues: 0

repo-level-codegen-papers

Repo-Level Code generation papers

Stargazers: 58, Issues: 0

tree-sitter

An incremental parsing system for programming tools

Language: Rust, License: MIT, Stargazers: 17855, Issues: 0

devdocs

API Documentation Browser

Language: Ruby, License: MPL-2.0, Stargazers: 34857, Issues: 0

axlearn

An Extensible Deep Learning Library

Language: Python, License: Apache-2.0, Stargazers: 1750, Issues: 0

Yuan2.0-M32

Mixture-of-Experts (MoE) Language Model

Language: Python, License: Apache-2.0, Stargazers: 176, Issues: 0

learning_ray

Notebooks for the O'Reilly book "Learning Ray"

Language: Jupyter Notebook, License: MIT, Stargazers: 241, Issues: 0

llama-agentic-system

Agentic components of the Llama Stack APIs

Language: Python, License: NOASSERTION, Stargazers: 3090, Issues: 0

open_lm

A repository for research on medium sized language models.

Language: Python, License: MIT, Stargazers: 463, Issues: 0

faker

Faker is a Python package that generates fake data for you.

Language: Python, License: MIT, Stargazers: 17526, Issues: 0

LLM101n

LLM101n: Let's build a Storyteller

Stargazers: 27537, Issues: 0

NeMo-Aligner

Scalable toolkit for efficient model alignment

Language: Python, License: Apache-2.0, Stargazers: 498, Issues: 0

datacomp

DataComp: In search of the next generation of multimodal datasets

Language: Python, License: NOASSERTION, Stargazers: 631, Issues: 0

dclm

DataComp for Language Models

Language: HTML, License: MIT, Stargazers: 1084, Issues: 0

NeMo-Framework-Launcher

Provides end-to-end model development pipelines for LLMs and Multimodal models that can be launched on-prem or cloud-native.

Language: Python, License: Apache-2.0, Stargazers: 441, Issues: 0

Skywork-MoE

Skywork-MoE: A Deep Dive into Training Techniques for Mixture-of-Experts Language Models

Stargazers: 120, Issues: 0

text-clustering

Easily embed, cluster and semantically label text datasets

Language: Python, License: Apache-2.0, Stargazers: 421, Issues: 0

maxtext

A simple, performant and scalable Jax LLM!

Language: Python, License: Apache-2.0, Stargazers: 1420, Issues: 0

unsloth

Finetune Llama 3.1, Mistral, Phi & Gemma LLMs 2-5x faster with 80% less memory

Language: Python, License: Apache-2.0, Stargazers: 14862, Issues: 0

SimPO

SimPO: Simple Preference Optimization with a Reference-Free Reward

Language: Python, License: MIT, Stargazers: 615, Issues: 0

dbrx

Code examples and resources for DBRX, a large language model developed by Databricks

Language: Python, License: NOASSERTION, Stargazers: 2495, Issues: 0

mistral-inference

Official inference library for Mistral models

Language: Jupyter Notebook, License: Apache-2.0, Stargazers: 9458, Issues: 0

llama-moe

⛷️ LLaMA-MoE: Building Mixture-of-Experts from LLaMA with Continual Pre-training

Language: Python, License: Apache-2.0, Stargazers: 832, Issues: 0

llm-swarm

Manage scalable open LLM inference endpoints in Slurm clusters

Language: Python, License: MIT, Stargazers: 216, Issues: 0
Language: Python, License: Apache-2.0, Stargazers: 408, Issues: 0

Bend

A massively parallel, high-level programming language

Language: Rust, License: Apache-2.0, Stargazers: 17125, Issues: 0

llama-recipes

Scripts for fine-tuning Meta Llama3 with composable FSDP & PEFT methods to cover single/multi-node GPUs. Supports default & custom datasets for applications such as summarization and Q&A. Supports a number of inference solutions such as HF TGI and vLLM for local or cloud deployment. Includes demo apps to showcase Meta Llama3 for WhatsApp & Messenger.

Language: Jupyter Notebook, Stargazers: 11443, Issues: 0

distilabel

Distilabel is a framework for synthetic data and AI feedback for engineers who need fast, reliable and scalable pipelines based on verified research papers.

Language: Python, License: Apache-2.0, Stargazers: 1311, Issues: 0