Gyeongchan-Yun's starred repositories

ml-systems-papers

Curated collection of papers in machine learning systems

Stargazers: 84 | Issues: 0

Megatron-LLM

distributed trainer for LLMs

Language: Python | License: NOASSERTION | Stargazers: 502 | Issues: 0

gpt-pytorch

PyTorch Implementation of OpenAI GPT

Language: Python | License: MIT | Stargazers: 108 | Issues: 0
Language: Python | License: MIT | Stargazers: 86 | Issues: 0

memory_profiler

Monitor Memory usage of Python code

Language: Python | License: NOASSERTION | Stargazers: 4277 | Issues: 0
Language: Python | Stargazers: 4 | Issues: 0

GPT2

PyTorch Implementation of OpenAI GPT-2

Language: Python | License: Apache-2.0 | Stargazers: 265 | Issues: 0
Language: Python | License: Apache-2.0 | Stargazers: 68 | Issues: 0

Oobleck

A resilient distributed training framework

Language: Python | License: Apache-2.0 | Stargazers: 71 | Issues: 0
Language: Python | Stargazers: 19 | Issues: 0

Megatron-DeepSpeed

Ongoing research training transformer language models at scale, including: BERT & GPT-2

Language: Python | License: NOASSERTION | Stargazers: 1751 | Issues: 0
Language: Python | Stargazers: 7 | Issues: 0

Chimera

Chimera: Efficiently Training Large-Scale Neural Networks with Bidirectional Pipelines.

Language: Python | License: GPL-3.0 | Stargazers: 38 | Issues: 0

gavel

Code for "Heterogeneity-Aware Cluster Scheduling Policies for Deep Learning Workloads", which appeared at OSDI 2020

Language: Jupyter Notebook | License: MIT | Stargazers: 122 | Issues: 0

KungFu

Fast and Adaptive Distributed Machine Learning for TensorFlow, PyTorch and MindSpore.

Language: Go | License: Apache-2.0 | Stargazers: 290 | Issues: 0

T2T-ViT

ICCV2021, Tokens-to-Token ViT: Training Vision Transformers from Scratch on ImageNet

Language: Jupyter Notebook | License: NOASSERTION | Stargazers: 1132 | Issues: 0

pytorch-MNIST-CelebA-GAN-DCGAN

PyTorch implementation of Generative Adversarial Networks (GAN) and Deep Convolutional Generative Adversarial Networks (DCGAN) for MNIST and CelebA datasets

Language: Python | Stargazers: 507 | Issues: 0

shockwave

Code for "Shockwave: Fair and Efficient Cluster Scheduling for Dynamic Adaptation in Machine Learning" [NSDI '23]

Language: Python | License: MIT | Stargazers: 38 | Issues: 0

SimiGrad

Anonymous repo created for NIPS submission SimiGrad

Language: Python | Stargazers: 1 | Issues: 0

Espresso

Hi-Speed DNN Training with Espresso: Unleashing the Full Potential of Gradient Compression with Near-Optimal Usage Strategies (EuroSys '23)

Language: Python | License: NOASSERTION | Stargazers: 10 | Issues: 0

byteps

A high performance and generic framework for distributed DNN training

Language: Python | License: NOASSERTION | Stargazers: 3592 | Issues: 0

SimiGrad

Public code for NIPS submission SimiGrad: Fine-Grained Adaptive Batching for Large Scale Training using Gradient Similarity Measurement

Language: Python | Stargazers: 1 | Issues: 0

awesome-gnn-systems

A list of awesome GNN systems.

Language: Python | Stargazers: 270 | Issues: 0

pyg_autoscale

Implementation of "GNNAutoScale: Scalable and Expressive Graph Neural Networks via Historical Embeddings" in PyTorch

Language: Python | License: MIT | Stargazers: 157 | Issues: 0

marius

Large scale graph learning on a single machine.

Language: C++ | License: Apache-2.0 | Stargazers: 160 | Issues: 0

DeepPlan

Fast and Efficient Model Serving Using Multi-GPUs with Direct-Host-Access (ACM EuroSys '23)

Language: C++ | License: MIT | Stargazers: 50 | Issues: 0

grace

GRACE - GRAdient ComprEssion for distributed deep learning

Language: Python | License: BSD-2-Clause | Stargazers: 130 | Issues: 0

SHADE

SHADE: Enable Fundamental Cacheability for Distributed Deep Learning Training

Language: Python | License: MIT | Stargazers: 28 | Issues: 0

adaptdl

Resource-adaptive cluster scheduler for deep learning training.

Language: Python | License: Apache-2.0 | Stargazers: 415 | Issues: 0

Awesome-ML-for-System

SOTA Learning-augmented Systems

Stargazers: 32 | Issues: 0