Keshi_Ge's repositories
Sketch_Pytorch
Communication-efficient PyTorch training with sketching
albert_pytorch
A Lite BERT for Self-Supervised Learning of Language Representations
Count-Sketch-Optimizers
A compressed adaptive optimizer for training large-scale deep learning models using PyTorch
darts
Differentiable architecture search for convolutional and recurrent networks
DistributedTest
Benchmarks for distributed training
GeKeShi.github.io
Keshi's blog
learning-to-quantize
Code for "Adaptive Gradient Quantization for Data-Parallel SGD", published in NeurIPS 2020.
MachineLearning
Machine Learning in Action(机器学习实战)
Megatron-LM
Ongoing research training transformer models at scale
powergossip
Code for "Practical Low-Rank Communication Compression in Decentralized Deep Learning"
powersgd
Practical low-rank gradient compression for distributed optimization: https://arxiv.org/abs/1905.13727
python-machine-learning-book
The "Python Machine Learning (1st edition)" book code repository and info resource
pytorch-distributed
A quickstart guide and benchmark for PyTorch distributed training.
PyTorch_GBW_LM
PyTorch Language Model for 1-Billion Word (LM1B / GBW) Dataset
sketchedsgd
Sketched SGD
TensorFlow-Tutorials
Simple tutorials using Google's TensorFlow framework
tensorflow_multigpu_imagenet
TensorFlow code for training different architectures (DenseNet, ResNet, AlexNet, GoogLeNet, VGG, NiN) on the ImageNet dataset, with multi-GPU and transfer learning support
training-bottleneck
Analyzing network performance bottlenecks in distributed training