databricks/megablocks Issues
Bad throughput with GLU (Updated 1)
support amd/rocm (Updated 3)
OSError: Stale file handle with dMoE (Updated 3)
Add a fine-tune script for JetMoE (Updated 2)
ScatterMoE feature (Updated 5)
Does this framework support SFT? (Updated 2)
AMP + BF16 failing (Updated 2)
selective router precision (Updated 1)
Docker issues with PyPI installation (Updated 3)
Comparison against top-2 routing? (Updated 4)
Efficiency of torch mlp (Closed 2)
Question on offsets in figures 5 (Closed 1)
About the Multi-node Script (Closed 4)
Inference code (Closed 5)
multi-node problem (Closed 5)