Huihuo Zheng's repositories
dlio_ml_workloads
Reference workloads for DLIO Benchmark
horovod
Distributed training framework for TensorFlow, Keras, PyTorch, and Apache MXNet.
dlio_benchmark
An I/O benchmark for deep learning applications
Megatron-DeepSpeed
Ongoing research training transformer language models at scale, including: BERT & GPT-2
DeepSpeed
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
E3SM-IO
Benchmark programs using the I/O pattern of E3SM
exahdf5_sdk
ExaHDF5 project build scripts
dlio-profiler
A low-level profiler for capturing I/O calls from deep learning applications.
vol-cache
HDF5 Cache VOL connector for caching data on fast storage layers and moving data asynchronously to the parallel file system to hide I/O overhead.
MLPerf_training
Reference implementations of MLPerf™ training benchmarks
E4S-Documenter
A tool to generate documentation for a project based on project metadata (README, Changelog, License, etc.) stored in a YAML file.
h5bench
A benchmark suite for measuring HDF5 performance.
user-guides
ALCF Systems User Documentation
incubator-mxnet
Lightweight, Portable, Flexible Distributed/Mobile Deep Learning with Dynamic, Mutation-aware Dataflow Dep Scheduler; for Python, R, Julia, Scala, Go, Javascript and more
dlio_profiling
This repo demonstrates how to profile I/O for deep learning applications. It is based on VaniDL.
amrex
AMReX: Software Framework for Block Structured AMR
scorpio
A high-level Parallel I/O Library for structured grid applications
vanidl
VaniDL is a tool for analyzing the I/O patterns and behavior of deep learning applications.
vol-async
HDF5 Asynchronous I/O VOL connector that enables asynchronous I/O for HDF5 applications
UnifyFS
UnifyFS: A file system for burst buffers