This repository contains the code to replicate the experiments of the anonymous paper "SWARM Parallelism: Training Large Models Can Be Surprisingly Communication-Efficient".
Instructions to replicate the compression-aware architecture experiments can be found in `bottleneck/README.md`.
Instructions to replicate the large-scale language model pretraining experiments and the throughput estimation on multiple preemptible nodes, as well as the prototype implementation of the mechanisms behind SWARM, are located in the `swarm` subfolder.
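For quick orientation, a minimal sketch of the repository layout implied by the pointers above (only the two subfolders this README mentions are shown; any other files are omitted):

```
.
├── bottleneck/   # compression-aware architecture experiments (see bottleneck/README.md)
└── swarm/        # SWARM prototype, large-scale LM pretraining, throughput estimation
```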