There are 26 repositories under the mixture-of-experts topic.
Run Mixtral-8x7B models in Colab or on consumer desktops
Decentralized deep learning in PyTorch. Built to train models across thousands of volunteer machines around the world.
Mixture-of-Experts for Large Vision-Language Models
PyTorch Re-Implementation of "The Sparsely-Gated Mixture-of-Experts Layer" by Noam Shazeer et al. https://arxiv.org/abs/1701.06538
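As a quick orientation to what that layer computes, here is a minimal, hypothetical top-k gating sketch in PyTorch (names like `TopKMoE` are illustrative assumptions, not the repo's API): a router scores the experts per token, only the top-k experts run, and their outputs are summed under renormalized gate weights.

```python
# Minimal top-k sparsely-gated MoE layer (illustrative sketch, not the repo's API).
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKMoE(nn.Module):
    def __init__(self, dim, num_experts=8, k=2, hidden=2048):
        super().__init__()
        self.k = k
        self.gate = nn.Linear(dim, num_experts, bias=False)   # router producing expert scores
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, hidden), nn.ReLU(), nn.Linear(hidden, dim))
            for _ in range(num_experts)
        )

    def forward(self, x):                      # x: (tokens, dim)
        weights, idx = self.gate(x).topk(self.k, dim=-1)
        weights = F.softmax(weights, dim=-1)   # renormalize over the k chosen experts
        out = torch.zeros_like(x)
        for slot in range(self.k):             # run each token through its k experts
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e       # tokens whose slot-th choice is expert e
                if mask.any():
                    out[mask] += weights[mask, slot, None] * expert(x[mask])
        return out

moe = TopKMoE(dim=512)
y = moe(torch.randn(16, 512))                  # -> (16, 512)
```

The paper additionally uses noisy top-k gating and an auxiliary load-balancing loss, both omitted here for brevity.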
⛷️ LLaMA-MoE: Building Mixture-of-Experts from LLaMA with Continual Pre-training
A TensorFlow Keras implementation of "Modeling Task Relationships in Multi-task Learning with Multi-gate Mixture-of-Experts" (KDD 2018)
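The MMoE idea itself is compact: all tasks share one pool of experts, but each task gets its own softmax gate over them, so tasks can weight the shared experts differently. The repo is Keras; the sketch below restates the idea in PyTorch for consistency with the other examples here, and all names (`MMoE`, `towers`) are illustrative assumptions.

```python
# Sketch of multi-gate mixture-of-experts (MMoE): shared experts, one gate per task.
import torch
import torch.nn as nn

class MMoE(nn.Module):
    def __init__(self, dim, num_experts=4, num_tasks=2, hidden=64):
        super().__init__()
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, hidden), nn.ReLU()) for _ in range(num_experts)
        )
        self.gates = nn.ModuleList(nn.Linear(dim, num_experts) for _ in range(num_tasks))
        self.towers = nn.ModuleList(nn.Linear(hidden, 1) for _ in range(num_tasks))

    def forward(self, x):                                              # x: (batch, dim)
        expert_out = torch.stack([e(x) for e in self.experts], dim=1)  # (batch, E, hidden)
        outputs = []
        for gate, tower in zip(self.gates, self.towers):
            w = gate(x).softmax(dim=-1).unsqueeze(-1)                  # (batch, E, 1)
            mixed = (w * expert_out).sum(dim=1)                        # task-specific mixture
            outputs.append(tower(mixed))                               # one head per task
        return outputs

model = MMoE(dim=32)
y1, y2 = model(torch.randn(8, 32))             # one prediction per task
```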
A PyTorch implementation of Sparsely-Gated Mixture of Experts, for massively increasing the parameter count of language models
From scratch implementation of a sparse mixture of experts language model inspired by Andrej Karpathy's makemore :)
Chinese Mixtral mixture-of-experts large language models (Chinese Mixtral MoE LLMs)
GMoE could be the next backbone model for many kinds of generalization tasks.
Implementation of ST-MoE, the latest incarnation of MoE after years of research at Brain, in PyTorch
Implementation of Soft MoE, proposed by Brain's Vision team, in PyTorch
A curated reading list of research in Adaptive Computation, Dynamic Compute & Mixture of Experts (MoE).
Some personal experiments around routing tokens to different autoregressive attention branches, akin to mixture-of-experts
RealCompo: Dynamic Equilibrium between Realism and Compositionality Improves Text-to-Image Diffusion Models
PyTorch implementation of Soft MoE by Google Brain in "From Sparse to Soft Mixtures of Experts" (https://arxiv.org/pdf/2308.00951.pdf)
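Unlike top-k routing, Soft MoE keeps everything differentiable: each expert "slot" is a softmax-weighted mixture of all tokens, and each token's output is a softmax-weighted mixture of all slot outputs. A minimal sketch under those assumptions (illustrative only, not this repo's actual API):

```python
# Sketch of a Soft MoE layer: fully differentiable dispatch/combine weights.
import torch
import torch.nn as nn

class SoftMoE(nn.Module):
    def __init__(self, dim, num_experts=4, slots_per_expert=1):
        super().__init__()
        self.num_experts = num_experts
        self.phi = nn.Parameter(torch.randn(dim, num_experts * slots_per_expert))
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))
            for _ in range(num_experts)
        )

    def forward(self, x):                         # x: (batch, tokens, dim)
        logits = x @ self.phi                     # (batch, tokens, slots)
        dispatch = logits.softmax(dim=1)          # per slot: distribution over tokens
        combine = logits.softmax(dim=2)           # per token: distribution over slots
        slots = dispatch.transpose(1, 2) @ x      # (batch, slots, dim)
        slot_chunks = slots.chunk(self.num_experts, dim=1)
        out_slots = torch.cat(
            [e(c) for e, c in zip(self.experts, slot_chunks)], dim=1
        )
        return combine @ out_slots                # (batch, tokens, dim)

layer = SoftMoE(dim=64)
y = layer(torch.randn(2, 10, 64))                 # same shape as the input
```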
Multi-Task Learning package built with TensorFlow 2 (Multi-Gate Mixture of Experts, Cross-Stitch, Uncertainty Weighting)
"Towards Crowdsourced Training of Large Neural Networks using Decentralized Mixture-of-Experts" (NeurIPS 2020), original PyTorch implementation
[ICLR 2024 Spotlight] Code for the paper "Merge, Then Compress: Demystify Efficient SMoE with Hints from Its Routing Policy"
PyTorch library for cost-effective, fast and easy serving of MoE models.
Hierarchical Mixture of Experts, Mixture Density Neural Network
The implementation of "Leeroo Orchestrator: Elevating LLMs Performance Through Model Integration"
[ICML 2022] "Neural Implicit Dictionary via Mixture-of-Expert Training" by Peihao Wang, Zhiwen Fan, Tianlong Chen, Zhangyang Wang
PyTorch Implementation of the Multi-gate Mixture-of-Experts with Exclusivity (MMoEEx)
This is the official repository of the papers "Parameter-Efficient Transfer Learning of Audio Spectrogram Transformers" and "Efficient Fine-tuning of Audio Spectrogram Transformers via Soft Mixture of Adapters".
Repository for our paper "See More Details: Efficient Image Super-Resolution by Experts Mining"