Min Si's repositories
pytorch
Tensors and Dynamic neural networks in Python with strong GPU acceleration
nccl
Optimized primitives for collective multi-GPU communication
nccl-tests
NCCL Tests
tutorials
PyTorch tutorials.
dummy_collectives
A minimal demo for PyTorch c10d extension APIs
param
PArametrized Recommendation and Ai Model benchmark is a repository for development of numerous uBenchmarks as well as end-to-end nets for evaluation of training and inference platforms.
torch_ucc
PyTorch process group third-party plugin for UCC
rccl
ROCm Communication Collectives Library (RCCL)
ucc
Unified Communication Collectives Library
gloo
Collective communications library with various primitives for multi-machine training.
mlnx-tools
Mellanox userland tools and scripts
darshan
Darshan I/O characterization tool
yaksa
Yaksa: High-performance Noncontiguous Data Management
libfabric
Open Fabric Interfaces
tests-sos
Sandia OpenSHMEM unit tests and performance testing suite
SOS
Sandia OpenSHMEM is an implementation of the OpenSHMEM specification over multiple Networking APIs, including Portals 4 and the Open Fabric Interface (OFI). Please click on the Wiki tab for help with building and using SOS.
openshmem-specification
OpenSHMEM Application Programming Interface
ucx
Unified Communication X (mailing list: https://elist.ornl.gov/mailman/listinfo/ucx-group)
armci-mpi
An implementation of ARMCI using MPI one-sided communication (RMA)
FAQs
Frequently Asked Questions for the reproducibility initiative of the SC Conference
PiP-glibc
glibc modified for PiP (Process-in-Process)