There are 26 repositories under the mixture-of-experts topic.
Run Mixtral-8x7B models in Colab or on consumer desktops
Decentralized deep learning in PyTorch. Built to train models across thousands of volunteer machines around the world.
Mixture-of-Experts for Large Vision-Language Models
PyTorch Re-Implementation of "The Sparsely-Gated Mixture-of-Experts Layer" by Noam Shazeer et al. https://arxiv.org/abs/1701.06538
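As a quick orientation to what that layer computes, here is a minimal, hypothetical top-k gating sketch in PyTorch (names like `TopKMoE` are illustrative assumptions, not the repo's API): a router scores the experts per token, only the top-k experts run, and their outputs are summed under renormalized gate weights.

```python
# Minimal top-k sparsely-gated MoE layer (illustrative sketch, not the repo's API).
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKMoE(nn.Module):
    def __init__(self, dim, num_experts=8, k=2, hidden=2048):
        super().__init__()
        self.k = k
        self.gate = nn.Linear(dim, num_experts, bias=False)   # router producing expert scores
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, hidden), nn.ReLU(), nn.Linear(hidden, dim))
            for _ in range(num_experts)
        )

    def forward(self, x):                      # x: (tokens, dim)
        weights, idx = self.gate(x).topk(self.k, dim=-1)
        weights = F.softmax(weights, dim=-1)   # renormalize over the k chosen experts
        out = torch.zeros_like(x)
        for slot in range(self.k):             # run each token through its k experts
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e       # tokens whose slot-th choice is expert e
                if mask.any():
                    out[mask] += weights[mask, slot, None] * expert(x[mask])
        return out

moe = TopKMoE(dim=512)
y = moe(torch.randn(16, 512))                  # -> (16, 512)
```

The paper additionally uses noisy top-k gating and an auxiliary load-balancing loss, both omitted here for brevity.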
⛷️ LLaMA-MoE: Building Mixture-of-Experts from LLaMA with Continual Pre-training
A TensorFlow Keras implementation of "Modeling Task Relationships in Multi-task Learning with Multi-gate Mixture-of-Experts" (KDD 2018)
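The MMoE idea itself is compact: all tasks share one pool of experts, but each task gets its own softmax gate over them, so tasks can weight the shared experts differently. The repo is Keras; the sketch below restates the idea in PyTorch for consistency with the other examples here, and all names (`MMoE`, `towers`) are illustrative assumptions.

```python
# Sketch of multi-gate mixture-of-experts (MMoE): shared experts, one gate per task.
import torch
import torch.nn as nn

class MMoE(nn.Module):
    def __init__(self, dim, num_experts=4, num_tasks=2, hidden=64):
        super().__init__()
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, hidden), nn.ReLU()) for _ in range(num_experts)
        )
        self.gates = nn.ModuleList(nn.Linear(dim, num_experts) for _ in range(num_tasks))
        self.towers = nn.ModuleList(nn.Linear(hidden, 1) for _ in range(num_tasks))

    def forward(self, x):                                              # x: (batch, dim)
        expert_out = torch.stack([e(x) for e in self.experts], dim=1)  # (batch, E, hidden)
        outputs = []
        for gate, tower in zip(self.gates, self.towers):
            w = gate(x).softmax(dim=-1).unsqueeze(-1)                  # (batch, E, 1)
            mixed = (w * expert_out).sum(dim=1)                        # task-specific mixture
            outputs.append(tower(mixed))                               # one head per task
        return outputs

model = MMoE(dim=32)
y1, y2 = model(torch.randn(8, 32))             # one prediction per task
```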
A PyTorch implementation of Sparsely-Gated Mixture of Experts, for massively increasing the parameter count of language models
From scratch implementation of a sparse mixture of experts language model inspired by Andrej Karpathy's makemore :)
Chinese Mixtral mixture-of-experts large language models (Chinese Mixtral MoE LLMs)
GMoE could be the next backbone model for many kinds of generalization tasks.
Implementation of ST-MoE, the latest incarnation of MoE after years of research at Brain, in PyTorch
Implementation of Soft MoE, proposed by Brain's Vision team, in PyTorch
A curated reading list of research in Adaptive Computation, Dynamic Compute & Mixture of Experts (MoE).
Some personal experiments around routing tokens to different autoregressive attention branches, akin to mixture-of-experts
RealCompo: Dynamic Equilibrium between Realism and Compositionality Improves Text-to-Image Diffusion Models
PyTorch implementation of Soft MoE by Google Brain in "From Sparse to Soft Mixtures of Experts" (https://arxiv.org/pdf/2308.00951.pdf)
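Unlike top-k routing, Soft MoE keeps everything differentiable: each expert "slot" is a softmax-weighted mixture of all tokens, and each token's output is a softmax-weighted mixture of all slot outputs. A minimal sketch under those assumptions (illustrative only, not this repo's actual API):

```python
# Sketch of a Soft MoE layer: fully differentiable dispatch/combine weights.
import torch
import torch.nn as nn

class SoftMoE(nn.Module):
    def __init__(self, dim, num_experts=4, slots_per_expert=1):
        super().__init__()
        self.num_experts = num_experts
        self.phi = nn.Parameter(torch.randn(dim, num_experts * slots_per_expert))
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))
            for _ in range(num_experts)
        )

    def forward(self, x):                         # x: (batch, tokens, dim)
        logits = x @ self.phi                     # (batch, tokens, slots)
        dispatch = logits.softmax(dim=1)          # per slot: distribution over tokens
        combine = logits.softmax(dim=2)           # per token: distribution over slots
        slots = dispatch.transpose(1, 2) @ x      # (batch, slots, dim)
        slot_chunks = slots.chunk(self.num_experts, dim=1)
        out_slots = torch.cat(
            [e(c) for e, c in zip(self.experts, slot_chunks)], dim=1
        )
        return combine @ out_slots                # (batch, tokens, dim)

layer = SoftMoE(dim=64)
y = layer(torch.randn(2, 10, 64))                 # same shape as the input
```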
Multi-Task Learning package built with TensorFlow 2 (Multi-Gate Mixture of Experts, Cross-Stitch, Uncertainty Weighting)
"Towards Crowdsourced Training of Large Neural Networks using Decentralized Mixture-of-Experts" (NeurIPS 2020), original PyTorch implementation
[ICLR 2024 Spotlight] Code for the paper "Merge, Then Compress: Demystify Efficient SMoE with Hints from Its Routing Policy"
PyTorch library for cost-effective, fast and easy serving of MoE models.
Hierarchical Mixture of Experts, Mixture Density Neural Network
The implementation of "Leeroo Orchestrator: Elevating LLMs Performance Through Model Integration"
[ICML 2022] "Neural Implicit Dictionary via Mixture-of-Expert Training" by Peihao Wang, Zhiwen Fan, Tianlong Chen, Zhangyang Wang
PyTorch Implementation of the Multi-gate Mixture-of-Experts with Exclusivity (MMoEEx)
This is the official repository of the papers "Parameter-Efficient Transfer Learning of Audio Spectrogram Transformers" and "Efficient Fine-tuning of Audio Spectrogram Transformers via Soft Mixture of Adapters".
Repository for our paper "See More Details: Efficient Image Super-Resolution by Experts Mining"