There are 31 repositories under the moe topic.
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
SGLang is a fast serving framework for large language models and vision language models.
:electron: An unofficial, UI-first https://bgm.tv app client for Android and iOS, built with React Native. An ad-free, hobby-driven, non-profit, ACG-focused anime-tracking client for bgm.tv in the style of Douban. Redesigned for mobile, it includes many enhanced features that are hard to implement in the web version and offers extensive customization options. Currently supports iOS and Android.
【TMM 2025🔥】 Mixture-of-Experts for Large Vision-Language Models
MoBA: Mixture of Block Attention for Long-Context LLMs
PyTorch Re-Implementation of "The Sparsely-Gated Mixture-of-Experts Layer" by Noam Shazeer et al. https://arxiv.org/abs/1701.06538
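The layer named in that paper pairs a learned gating network with sparse top-k expert selection. A minimal PyTorch sketch of the idea (simplified: no noisy gating, capacity limits, or load-balancing loss; class and parameter names here are illustrative, not taken from the repo):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKMoE(nn.Module):
    """Minimal sparsely-gated MoE layer: each token is routed to its top-k
    experts and their outputs are mixed with renormalized gate weights."""
    def __init__(self, d_model: int, d_hidden: int, n_experts: int = 8, k: int = 2):
        super().__init__()
        self.k = k
        self.gate = nn.Linear(d_model, n_experts, bias=False)  # gating network
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(d_model, d_hidden), nn.ReLU(), nn.Linear(d_hidden, d_model))
            for _ in range(n_experts)
        ])

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (tokens, d_model)
        logits = self.gate(x)                                # (tokens, n_experts)
        top_w, top_idx = logits.topk(self.k, dim=-1)         # keep the k best experts per token
        top_w = F.softmax(top_w, dim=-1)                     # renormalize over the chosen k
        out = torch.zeros_like(x)
        for slot in range(self.k):                           # loop over the k routing slots
            for e, expert in enumerate(self.experts):
                mask = top_idx[:, slot] == e                 # tokens sent to expert e in this slot
                if mask.any():
                    out[mask] += top_w[mask, slot, None] * expert(x[mask])
        return out

# usage: route a batch of 16 token vectors through the layer
moe = TopKMoE(d_model=64, d_hidden=256)
y = moe(torch.randn(16, 64))
print(y.shape)  # torch.Size([16, 64])
```

Only the selected k experts run per token, which is what keeps the parameter count large while the per-token compute stays roughly constant.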
⛷️ LLaMA-MoE: Building Mixture-of-Experts from LLaMA with Continual Pre-training (EMNLP 2024)
A toolkit for inference and evaluation of 'mixtral-8x7b-32kseqlen' from Mistral AI
An open-source solution for full-parameter fine-tuning of DeepSeek-V3/R1 671B, including complete code and scripts from training to inference, along with practical experience and conclusions gathered along the way.
Chinese Mixtral mixture-of-experts large language models (Chinese Mixtral MoE LLMs)
MindSpore online courses: Step into LLM
😘 A Pinterest-style layout site that shows illustrations from pixiv.net ordered by popularity.
Official LISTEN.moe Android app
Inferflow is an efficient and highly configurable inference engine for large language models (LLMs).
[ICLR 2025] MoE++: Accelerating Mixture-of-Experts Methods with Zero-Computation Experts
ModuleFormer is a MoE-based architecture that includes two different types of experts: stick-breaking attention heads and feedforward experts. We released a collection of ModuleFormer-based Language Models (MoLM) ranging in scale from 4 billion to 8 billion parameters.
Ling is a MoE LLM provided and open-sourced by InclusionAI.
Implementation of MoE-Mamba from the paper "MoE-Mamba: Efficient Selective State Space Models with Mixture of Experts" in PyTorch and Zeta
Official LISTEN.moe Desktop Client
Implementation of Switch Transformers from the paper: "Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity"
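Switch routing simplifies the gate above to a single expert per token and keeps experts balanced with an auxiliary loss, roughly alpha * N * sum_i f_i * P_i, where f_i is the fraction of tokens dispatched to expert i and P_i its mean router probability. A hedged sketch of that loss (function name and defaults are illustrative, assuming token-major router logits):

```python
import torch
import torch.nn.functional as F

def switch_load_balancing_loss(router_logits: torch.Tensor, alpha: float = 0.01) -> torch.Tensor:
    """Auxiliary load-balancing loss in the spirit of the Switch Transformer paper:
    under top-1 routing, encourage a uniform split of tokens across experts."""
    n_experts = router_logits.shape[-1]
    probs = F.softmax(router_logits, dim=-1)            # (tokens, n_experts) router probabilities
    top1 = probs.argmax(dim=-1)                         # top-1 expert chosen for each token
    f = F.one_hot(top1, n_experts).float().mean(dim=0)  # f_i: fraction of tokens routed to expert i
    p = probs.mean(dim=0)                               # P_i: mean router probability for expert i
    return alpha * n_experts * torch.sum(f * p)         # smallest when both are uniform (1/N each)

# usage: router logits for 1024 tokens over 8 experts
loss = switch_load_balancing_loss(torch.randn(1024, 8))
```

The product f_i * P_i is minimized when routing is uniform, so adding this term to the task loss discourages the router from collapsing onto a few experts.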
🚀 Easy, open-source LLM finetuning with one-line commands, seamless cloud integration, and popular optimization frameworks. ✨
[ICLR 2025] Dynamic Mixture of Experts: An Auto-Tuning Approach for Efficient Transformer Models
Large-scale 4D-parallel pre-training of Mixture-of-Experts models for 🤗 transformers *(still a work in progress)*
🚀 LLaMA-MoE v2: Exploring Sparsity of LLaMA from Perspective of Mixture-of-Experts with Post-Training
japReader is an app for breaking down Japanese sentences and tracking vocabulary progress
Batch-download high-quality videos from https://twist.moe
MOE is an event-driven OS for 8/16/32-bit MCUs. MOE stands for "Minds Of Embedded system"; it's also the name of my lovely baby daughter :sunglasses:
Fork of Moe Counter powered by Cloudflare Workers.
Official LISTEN.moe Windows-only Client
A command line tool for all things anime
"Guild leader, I'm stuck on the tree" - a Princess Connect voice pack for the vscode-rainbow-fart extension (Priconne extension vocal pack)