There are 26 repositories under the data-parallelism topic.
Making large AI models cheaper, faster and more accessible
Distributed Deep Learning, with a focus on distributed training, using Keras and Apache Spark.
PaddlePaddle (飞桨) large-model development suite, providing a full-pipeline development toolchain for large language models, cross-modal large models, biocomputing large models, and other domains.
LiBai(李白): A Toolbox for Large-Scale Distributed Parallel Training
Easy Parallel Library (EPL) is a general and efficient deep learning framework for distributed model training.
Distributed Keras Engine: make Keras faster with only one line of code.
Orkhon: ML Inference Framework and Server Runtime
Distributed training (multi-node) of a Transformer model
This repository provides hands-on labs on PyTorch-based distributed training and SageMaker Distributed Training. It is written so beginners can get started easily, guiding you through step-by-step code modifications starting from a basic BERT use case.
:coffee: Implementation of parallel matrix multiplication methods using Fox's algorithm on Peking University's high-performance computing system
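As an illustration of the entry above, here is a minimal sketch of Fox's algorithm, simulated sequentially on a q × q process grid in pure Python (in the actual repo each grid cell would be an MPI process; all function names here are hypothetical):

```python
# Fox's algorithm sketch: at step s, each grid cell (i, j) multiplies the
# A-block broadcast along its row, A[i][(i+s) % q], with the B-block that
# has been rolled upward s times, B[(i+s) % q][j]. The roll is folded into
# the index here since everything runs in one process.

def split_blocks(M, q):
    """Split an n x n matrix (list of lists) into a q x q grid of blocks."""
    n = len(M)
    b = n // q
    return [[[row[j*b:(j+1)*b] for row in M[i*b:(i+1)*b]]
             for j in range(q)] for i in range(q)]

def block_matmul_add(C, A, B):
    """C += A @ B for square blocks stored as lists of lists."""
    b = len(A)
    for i in range(b):
        for k in range(b):
            a = A[i][k]
            for j in range(b):
                C[i][j] += a * B[k][j]

def fox_multiply(A, B, q):
    n = len(A)
    b = n // q                      # q must divide n
    Ab, Bb = split_blocks(A, q), split_blocks(B, q)
    Cb = [[[[0] * b for _ in range(b)] for _ in range(q)] for _ in range(q)]
    for step in range(q):
        for i in range(q):
            k = (i + step) % q      # A[i][k] is broadcast along row i
            for j in range(q):
                block_matmul_add(Cb[i][j], Ab[i][k], Bb[k][j])
    # Reassemble the full result matrix from its blocks.
    C = [[0] * n for _ in range(n)]
    for i in range(q):
        for j in range(q):
            for r in range(b):
                for c in range(b):
                    C[i*b + r][j*b + c] = Cb[i][j][r][c]
    return C
```

After q steps every cell has accumulated its full block of C, which is why the algorithm maps naturally onto a square process grid with row broadcasts and column shifts.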
Understanding the effects of data parallelism and sparsity on neural network training
A decentralized and distributed framework for training DNNs
OpenCL powered Merklization using BLAKE3
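A minimal sketch of the Merklization the entry above accelerates. Python's standard library has no BLAKE3, so `blake2b` stands in for the hash; the repo itself runs BLAKE3 kernels on OpenCL devices, which this plain-CPU version only mimics:

```python
# Binary Merkle tree: hash the leaves, then repeatedly hash adjacent
# pairs until one root remains. An odd node at any level is paired
# with a copy of itself (one common convention; schemes differ).
import hashlib

def h(data: bytes) -> bytes:
    # blake2b as a stdlib stand-in for BLAKE3 (assumption, not the repo's hash)
    return hashlib.blake2b(data, digest_size=32).digest()

def merkle_root(leaves):
    """Compute the root hash of a non-empty list of byte-string leaves."""
    level = [h(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])     # duplicate last node on odd levels
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]
```

Each tree level's pair-hashes are independent of one another, which is what makes the workload a good fit for GPU-style data parallelism.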
pipeDejavu: Hardware-aware Latency Predictable, Differentiable Search for Faster Config and Convergence of Distributed ML Pipeline Parallelism
Batch Partitioning for Multi-PE Inference with TVM (2020)
This project parallelises pre-processing, measurement, and machine learning in the cloud, and evaluates and analyses the resulting cloud performance.
SIMD multithreaded Monte Carlo options pricer in Rust 🦀
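A pure-Python sketch of the same idea: a Monte Carlo European-call pricer whose paths are partitioned into independently seeded chunks. The repo does the heavy lifting with SIMD and threads in Rust; this version only shows the data-parallel structure, and all names are hypothetical:

```python
# Each worker simulates its own chunk of geometric-Brownian-motion paths
# with a private RNG, returns a partial payoff sum, and the main thread
# combines the partial sums into the discounted price.
import math
import random
from concurrent.futures import ThreadPoolExecutor

def chunk_payoff_sum(seed, n, s0, k, r, sigma, t):
    rng = random.Random(seed)           # per-chunk RNG: chunks are independent
    drift = (r - 0.5 * sigma ** 2) * t
    vol = sigma * math.sqrt(t)
    total = 0.0
    for _ in range(n):
        st = s0 * math.exp(drift + vol * rng.gauss(0.0, 1.0))
        total += max(st - k, 0.0)       # European call payoff
    return total

def mc_call_price(s0, k, r, sigma, t, n_paths=100_000, workers=4):
    per = n_paths // workers
    with ThreadPoolExecutor(workers) as ex:
        sums = ex.map(chunk_payoff_sum, range(workers),
                      [per] * workers, [s0] * workers, [k] * workers,
                      [r] * workers, [sigma] * workers, [t] * workers)
    return math.exp(-r * t) * sum(sums) / (per * workers)
```

In CPython the GIL keeps these threads from running the math concurrently; the Rust version gets real parallel speedup, but the chunk-and-reduce shape is the same.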
A complex ray-tracing algorithm parallelized over different partitioning schemes, exploring the performance gains over the sequential algorithm as grain size and number of processing units vary when rendering a high-resolution image.
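The grain-size idea above can be sketched as follows: the image's pixel rows are split into grains, and each grain becomes one task. The trivial `shade()` below is a hypothetical placeholder; a real tracer would cast rays per pixel:

```python
# Row-based partitioning with a tunable grain size: small grains balance
# load better, large grains cut scheduling overhead.
from concurrent.futures import ThreadPoolExecutor

def shade(x, y):
    return (x * 31 + y * 17) % 256      # placeholder per-pixel work

def render_rows(y0, y1, width):
    return [[shade(x, y) for x in range(width)] for y in range(y0, y1)]

def render(width, height, grain=8, workers=4):
    spans = [(y, min(y + grain, height)) for y in range(0, height, grain)]
    with ThreadPoolExecutor(workers) as ex:
        chunks = ex.map(lambda s: render_rows(s[0], s[1], width), spans)
    return [row for chunk in chunks for row in chunk]   # order is preserved
```

Sweeping `grain` and `workers` against a sequential baseline is exactly the kind of parameter study the repo describes.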
Torch Automatic Distributed Neural Network (TorchAD-NN) training library. Built on top of TorchMPI, this module automatically parallelizes neural network training.
Scaling Unet in Pytorch
Sequential and Parallel Implementation of the Hodgkin-Huxley Neuron model.
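For reference, a sequential forward-Euler sketch of the Hodgkin-Huxley model with the classic squid-axon parameters; the repo's parallel version distributes this same ODE step, which is all the sketch shows:

```python
# Membrane equation C_m dV/dt = I_ext - I_Na - I_K - I_L, with the
# standard gating variables m, h, n integrated by forward Euler.
import math

def hh_simulate(i_ext=10.0, dt=0.01, t_max=50.0):
    """Return the membrane-voltage trace (mV) under constant current i_ext (uA/cm^2)."""
    c_m, g_na, g_k, g_l = 1.0, 120.0, 36.0, 0.3       # capacitance, conductances
    e_na, e_k, e_l = 50.0, -77.0, -54.387             # reversal potentials (mV)
    v, m, h, n = -65.0, 0.053, 0.596, 0.317           # resting-state values
    trace = []
    for _ in range(int(t_max / dt)):
        a_m = 0.1 * (v + 40) / (1 - math.exp(-(v + 40) / 10))
        b_m = 4.0 * math.exp(-(v + 65) / 18)
        a_h = 0.07 * math.exp(-(v + 65) / 20)
        b_h = 1 / (1 + math.exp(-(v + 35) / 10))
        a_n = 0.01 * (v + 55) / (1 - math.exp(-(v + 55) / 10))
        b_n = 0.125 * math.exp(-(v + 65) / 80)
        i_na = g_na * m ** 3 * h * (v - e_na)
        i_k = g_k * n ** 4 * (v - e_k)
        i_l = g_l * (v - e_l)
        v += dt * (i_ext - i_na - i_k - i_l) / c_m
        m += dt * (a_m * (1 - m) - b_m * m)
        h += dt * (a_h * (1 - h) - b_h * h)
        n += dt * (a_n * (1 - n) - b_n * n)
        trace.append(v)
    return trace
```

With a suprathreshold current (around 10 µA/cm²) the trace shows repetitive spiking; with no current it sits at rest near −65 mV.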
A SYCL-like kernel compiler for Python
Scaling Unet in Tensorflow
Official Repository for the paper: Distributing Deep Learning Hyperparameter Tuning for 3D Medical Image Segmentation
Binary data classification using TensorFlow and Keras in Python, with data parallelism achieved via MPI
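The data-parallel pattern that last entry uses with MPI looks like this: each rank computes gradients on its shard of the batch, then the gradients are averaged across ranks (an allreduce) before every rank applies the same update. Sketched here without mpi4py, simulating the ranks sequentially (all names hypothetical):

```python
# Data-parallel SGD on a toy model y = w*x: scatter the batch, compute
# per-rank gradients, average them (the MPI_Allreduce step), update.

def local_gradient(w, shard):
    """Mean-squared-error gradient for y = w*x on one data shard."""
    g = 0.0
    for x, y in shard:
        g += 2 * (w * x - y) * x
    return g / len(shard)

def allreduce_mean(values):
    """Stand-in for MPI_Allreduce with MPI_SUM followed by division by size."""
    return sum(values) / len(values)

def data_parallel_sgd(data, n_ranks=4, lr=0.01, steps=100):
    shards = [data[r::n_ranks] for r in range(n_ranks)]  # scatter the batch
    w = 0.0
    for _ in range(steps):
        grads = [local_gradient(w, s) for s in shards]   # per-rank compute
        w -= lr * allreduce_mean(grads)                  # synchronized update
    return w
```

Because every rank applies the identical averaged gradient, the replicas never drift apart, which is the core invariant of synchronous data parallelism.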