ParCIS Lab, BUPT

ParCIS Lab, BUPT's repositories

Magicube

Magicube is a high-performance library for quantized sparse matrix operations (SpMM and SDDMM) in deep learning on Tensor Cores.

Language: C++ | License: GPL-3.0 | 89 3 2

Chimera

Chimera: bidirectional pipeline parallelism for efficiently training large-scale models.

Language: Python | License: GPL-3.0 | 67 1 4

FlashSparse

FlashSparse significantly reduces computation redundancy for unstructured sparsity (in SpMM and SDDMM) on Tensor Cores through a Swap-and-Transpose mapping strategy. FlashSparse was accepted to PPoPP 2025.

Language: Cuda | 29 10

Ok-Topk

Ok-Topk is a scheme for distributed training with sparse gradients. It integrates a novel sparse allreduce algorithm (less than 6k communication volume, which is asymptotically optimal) with the decentralized parallel Stochastic Gradient Descent (SGD) optimizer. Its convergence is proved theoretically and confirmed empirically.

Language: Python | License: GPL-3.0 | 27 1 4

DNN-cpp-proxies

C++/MPI proxies for distributed training of deep neural networks.

Language: C++ | License: GPL-3.0 | 1 20